Pavel Smaglo — Data Scientist & AI Automation

SEC / 03

Selected cases

Eight projects, from interactive analytics dashboards and a global swimming-development world map to autonomous AI pipelines on n8n.

CASE 01 · Data Engineering & Visualization

A development map of water sports in Russia

An interactive analytics dashboard for the leadership of the Russian Water Sports Federation: from scattered Excel files to a single data-driven decision system.

Problem

Data on athletes, coaches, venues and results was stored across dozens of separate files and systems. Preparing a single analytics report took weeks of manual work. Leadership had no holistic view of the industry.

Solution

Designed a relational DB (15+ tables, PostgreSQL) to unify all sources
Geocoding — placed 3,600+ pools across Russia on an interactive map
A composite index of regions from 124 absolute and 60 aggregated metrics
Dashboard with regional rankings, heatmaps and filters
Index system: infrastructure availability, staffing, participation
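The composite index in the list above can be pictured roughly as follows. This is a minimal sketch, assuming min-max normalization and equal weights; the region names, metric names, values and the `composite_index` helper are all hypothetical, and the real system combines far more metrics with a method not described here.

```python
from statistics import mean

# Hypothetical per-region metrics (illustrative values only).
regions = {
    "Region A": {"pools_per_100k": 4.0, "coaches_per_pool": 6.0, "participation_pct": 3.0},
    "Region B": {"pools_per_100k": 2.0, "coaches_per_pool": 3.0, "participation_pct": 1.0},
    "Region C": {"pools_per_100k": 3.0, "coaches_per_pool": 9.0, "participation_pct": 2.0},
}

def min_max(values):
    """Scale one metric across regions to the 0..1 range."""
    lo, hi = min(values.values()), max(values.values())
    if hi == lo:
        return {name: 0.0 for name in values}
    return {name: (v - lo) / (hi - lo) for name, v in values.items()}

def composite_index(data):
    """Average the normalized metrics per region and scale to 0..100."""
    metrics = list(next(iter(data.values())).keys())
    normalized = {
        m: min_max({region: vals[m] for region, vals in data.items()})
        for m in metrics
    }
    return {
        region: round(100 * mean(normalized[m][region] for m in metrics), 1)
        for region in data
    }

# Regions sorted from strongest to weakest by the composite score.
ranking = sorted(composite_index(regions).items(), key=lambda kv: kv[1], reverse=True)
```

Normalizing first keeps metrics with large raw values from dominating the score; weighted averaging would be a natural next step if some sub-indices matter more than others.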

Result

Leadership got a decision-making tool that shows where infrastructure is overloaded, where it is underused and which regions are the drivers, plus the ability to model the effect of investments.

Python · PostgreSQL · ClickHouse · DataLens · Power BI · GeoJSON · ETL

Open the interactive map →

smaglo.space/dashboard

Interactive map of water-sports development across Russia (FVVSR)

3M+ records in DB · 10+ data sources · 3,600+ pools on map

CASE 02 · Data Research & Geo-Visualization

A world map of swimming development

From a Russia dashboard to the global picture: an interactive world map that ranks 198 countries by a Swimming-Education Development Index (SDI), built on deep research of official sources.

Problem

There was no single, comparable picture of how countries develop mass swimming education and water safety. The data was scattered across national federations, ministries and methodology documents in dozens of languages.

Solution

Designed the SDI index — 8 weighted criteria (0–5 scale, weights sum to 100) with a transparent formula
Deep-research profiles for 198 countries from official sources: federations, ministries, statistics, methodologies
Interactive D3 + TopoJSON world map: country search, tooltips, side panel and colour legend
A drill-down page per country plus an open, documented scoring methodology
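A weighted index of this shape can be written as SDI = Σ wᵢ · sᵢ / 5, which lands on a 0–100 scale when the weights sum to 100. The sketch below is illustrative only: the criterion names and individual weights are hypothetical, and only "8 criteria, 0–5 scale, weights sum to 100" comes from the description above.

```python
MAX_SCORE = 5  # each criterion is scored on a 0-5 scale

# Hypothetical criteria and weights; the real ones are in the published methodology.
WEIGHTS = {
    "school_programme": 20,
    "national_standard": 15,
    "pool_access": 15,
    "instructor_training": 10,
    "water_safety_policy": 10,
    "participation": 10,
    "funding": 10,
    "data_transparency": 10,
}
assert len(WEIGHTS) == 8 and sum(WEIGHTS.values()) == 100

def sdi(scores):
    """Return the 0-100 index for one country from its 0-5 criterion scores."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing criteria: {sorted(missing)}")
    for name in WEIGHTS:
        if not 0 <= scores[name] <= MAX_SCORE:
            raise ValueError(f"{name} must be between 0 and {MAX_SCORE}")
    return sum(WEIGHTS[name] * scores[name] / MAX_SCORE for name in WEIGHTS)

# Example: a country scoring 3 on every criterion.
example_country = {name: 3 for name in WEIGHTS}
example_sdi = sdi(example_country)
```

Dividing by the maximum score before weighting makes every weight read directly as "points out of 100", which keeps the formula easy to audit.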

Result

A single tool to compare countries, separate leaders from laggards and ground decisions in transparent, reproducible scoring.

D3.js · TopoJSON · GeoJSON · Deep Research · Index Design · JavaScript

Open the world map →

smaglo.space/swimming

Interactive world map of swimming-education development (SDI)

198 countries indexed · 8 scoring criteria · 0–100 index scale

CASE 03 · Machine Learning & Interpretability

Predictive model: who wins a medal?

A model that estimates each swimmer's medal chances from their competition history and result dynamics.

Problem

The federation funds the training of hundreds of athletes, but the budget is limited. Coaches pick candidates intuitively — subjectively and without common criteria.

Solution

Compared four ML algorithms and chose the best by cross-validation (LightGBM)
Tuned hyperparameters over 40 iterations (RandomizedSearchCV)
Identified 8 key factors out of 13 via SHAP feature importance
Checked the effect of class-imbalance correction (SMOTE/SMOTENC): the ceiling is set by data quality, not by resampling
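Choosing a model by cross-validation, as in the first step above, follows the pattern sketched here. This is a toy, standard-library illustration: the two candidate "models", the synthetic data and all function names are hypothetical stand-ins for the four algorithms actually compared. In a scikit-learn workflow the same loop is usually handled by `cross_val_score`.

```python
import random

def k_fold_indices(n, k, seed=0):
    """Shuffle row indices and split them into k folds of near-equal size."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def cross_val_accuracy(fit, X, y, k=5):
    """Mean held-out accuracy of the model produced by fit() over k folds."""
    scores = []
    for held_out in k_fold_indices(len(X), k):
        test = set(held_out)
        train = [i for i in range(len(X)) if i not in test]
        predict = fit([X[i] for i in train], [y[i] for i in train])
        correct = sum(predict(X[i]) == y[i] for i in held_out)
        scores.append(correct / len(held_out))
    return sum(scores) / k

def fit_majority(X, y):
    """Baseline: always predict the most common training label."""
    label = max(set(y), key=y.count)
    return lambda x: label

def fit_threshold(X, y):
    """One-feature rule: cut halfway between the two class means."""
    pos = [x for x, t in zip(X, y) if t == 1]
    neg = [x for x, t in zip(X, y) if t == 0]
    cut = (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2
    return lambda x: int(x > cut)

# Synthetic data: a single score per athlete; label 1 marks a medalist.
X = [i / 10 for i in range(100)]
y = [1 if x >= 7 else 0 for x in X]

candidates = {"majority_baseline": fit_majority, "threshold_rule": fit_threshold}
results = {name: cross_val_accuracy(fit, X, y) for name, fit in candidates.items()}
best = max(results, key=results.get)
```

Because every candidate is scored on the same held-out folds, the comparison reflects how each generalizes rather than how well it memorizes the training rows.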

Result

The model predicts a medalist with 91.6% accuracy. Five top factors: current level, peer ranking, progress rate, peak progression, consistency. Every prediction comes with an explanation.

LightGBM · SHAP · scikit-learn · RandomizedSearchCV · SMOTE · Python

shap_summary.ipynb

91.6% forecast accuracy · success factors · 4 models compared

CASE 04 · Computer Vision & Biomechanics · NDA

Analysing athlete technique from video

A computer-vision system that builds a digital skeleton of an athlete from video, measures joint angles and detects movement asymmetries.

Problem

A coach judges technique by eye — but misses micro-asymmetries that reduce efficiency and raise injury risk. Objective numbers are needed: angles, ranges, side balance.

Solution

17 key body points in every frame (YOLOv8 / MediaPipe)
Angles for each joint, broken down by movement phase
Automatic left/right comparison — flags asymmetries
Export to a report with charts to track progress between sessions
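The joint-angle and left/right comparison steps above reduce to a little vector geometry once keypoints are available. A minimal sketch, assuming 2D keypoints as `(x, y)` pairs; the function names, sample coordinates and the 10% flag threshold are hypothetical, not taken from the project.

```python
import math

def joint_angle(a, b, c):
    """Angle at point b, in degrees, between segments b->a and b->c."""
    v1 = (a[0] - b[0], a[1] - b[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    norm = math.hypot(*v1) * math.hypot(*v2)
    if norm == 0:
        raise ValueError("keypoints must not coincide with the joint")
    cos = (v1[0] * v2[0] + v1[1] * v2[1]) / norm
    # Clamp to guard against floating-point drift outside [-1, 1].
    return math.degrees(math.acos(max(-1.0, min(1.0, cos))))

def asymmetry_pct(left, right):
    """Left/right difference as a percentage of the mean of the two angles."""
    mean = (left + right) / 2
    return 0.0 if mean == 0 else abs(left - right) / mean * 100

# Hypothetical keypoints for one frame: shoulder, elbow, wrist on each side.
left_elbow = joint_angle((0, 0), (1, 0), (1, 1))    # bent arm
right_elbow = joint_angle((0, 0), (1, 0), (2, 0))   # straight arm

FLAG_THRESHOLD_PCT = 10  # assumed cut-off for reporting an asymmetry
flagged = asymmetry_pct(left_elbow, right_elbow) > FLAG_THRESHOLD_PCT
```

Running the same calculation per frame and per movement phase gives the time series that the progress charts would be built from.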

Result

Coaches got objective metrics instead of subjective judgement. Athletes correct movements faster and lower injury risk thanks to early imbalance diagnosis.

YOLOv8 · MediaPipe · OpenCV · numpy · matplotlib

pose_analysis.mp4 · output

up to 116 skeleton points · realtime charts · ∞ videos at once

CASE 05 · Computer Vision & Object Detection

Blood cell detection on microscope images

A neural network that finds and classifies red cells, white cells and platelets on blood-smear images — instantly and without a lab technician.

Problem

Lab technicians count blood cells by hand under a microscope — slow, tiring and error-prone. One image takes minutes, and there are dozens a day.

Solution

Faster R-CNN (ResNet50-FPN v2), fine-tuned on blood-smear images
Three classes in one pass: red cells, white cells, platelets
Transfer learning — adapted in 10 epochs from ImageNet weights
Each cell boxed with class and confidence
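Turning boxed detections into a cell count is a small post-processing step. The sketch below assumes a prediction shaped like the dictionaries torchvision detection models return (`boxes`, `labels`, `scores`), written here with plain lists instead of tensors; the label-id mapping, the sample values and the `count_cells` helper are hypothetical.

```python
from collections import Counter

# Hypothetical mapping from label ids to class names.
CLASS_NAMES = {1: "red cell", 2: "white cell", 3: "platelet"}

def count_cells(prediction, min_score=0.9):
    """Count detections per class, keeping only those at or above min_score."""
    counts = Counter()
    for label, score in zip(prediction["labels"], prediction["scores"]):
        if score >= min_score:
            counts[CLASS_NAMES[label]] += 1
    return dict(counts)

# Illustrative prediction for one image (boxes are x1, y1, x2, y2 in pixels).
prediction = {
    "boxes": [[10, 10, 40, 40], [50, 12, 82, 44], [90, 90, 150, 150],
              [5, 120, 15, 130], [200, 200, 230, 230]],
    "labels": [1, 1, 2, 3, 1],
    "scores": [0.98, 0.95, 0.93, 0.91, 0.42],
}

counts = count_cells(prediction)
```

The low-scoring fifth detection is dropped by the threshold, so the count reflects only boxes the model is confident about.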

Result

The model detects cells with confidence scores of 0.9 and above. One image takes about a second instead of minutes of manual counting. A ready base for integration into lab systems.

Faster R-CNN · ResNet50-FPN · PyTorch · torchvision · Transfer Learning

blood_cells_detection.png

3 cell types · 0.9+ model confidence · ~1s per image

CASE 06 · AI Automation & Workflow Orchestration

AI automation of business processes

n8n workflows that replace routine: they ingest documents, extract data, draft replies, update databases and dispatch results — with no human in the loop.

Problem

Teams spend hours on repetitive tasks: parsing documents, moving data between systems, drafting standard replies. Each is simple, but there are hundreds.

What's automated

Inbound document processing: file → AI extracts fields → DB + CRM
AI assistants with access to a knowledge base (RAG)
ETL pipelines: APIs/tables/files → cleaning → warehouse
Document generation from a template + data in seconds

How it works

Each scenario is a visual n8n pipeline: a trigger (webhook, schedule, file) starts a chain of steps, AI processes the data, and the result goes to the right system. 24/7, no code.
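n8n itself needs no code, but the trigger-then-steps pattern is easy to picture in Python. This is only a sketch of the control flow: every step below is a hard-coded stub standing in for a real node (PDF reader, AI call, database write, chat notification), and all names and values are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Job:
    path: str
    data: dict = field(default_factory=dict)
    done: list = field(default_factory=list)

def extract_text(job):
    # Stub: a real step would read the file at job.path.
    job.data["text"] = "INVOICE No. 17, total 250.00"

def classify(job):
    # Stub for the AI classification call.
    job.data["kind"] = "invoice" if "invoice" in job.data["text"].lower() else "other"

def extract_fields(job):
    # Stub for the AI field-extraction call.
    job.data["fields"] = {"number": "17", "total": "250.00"}

def store(job):
    # Stub: a real step would write the fields to a database.
    job.data["stored"] = True

def notify(job):
    # Stub: a real step would send a chat notification.
    job.data["notified"] = True

STEPS = [extract_text, classify, extract_fields, store, notify]

def on_new_file(path):
    """Trigger handler: run every step in order and record what completed."""
    job = Job(path)
    for step in STEPS:
        step(job)
        job.done.append(step.__name__)
    return job

job = on_new_file("/inbox/example.pdf")
```

Each step reads and writes one shared job object, which mirrors how data flows from node to node in a visual workflow.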

n8n · OpenAI API · RAG · PostgreSQL · Webhook · REST API

n8n workflow · execution

▶ Trigger: new file in /inbox

1. Extract text from PDF ✓
2. GPT: classify → invoice ✓
3. GPT: extract fields ✓
4. Write to PostgreSQL ✓
5. Notify in Telegram ✓

// done in 3.2s · no human involved

⚡ next file in 00:00:12...

12+ scenarios in prod · 24/7 with no humans · 0 lines of code for the user

CASE 07 · AI Engineering & Agent Orchestration

An agentic pipeline for fund reports

One "Go!" turns 1000+ source documents (~1.5 GB of scans, contracts and estimates) into a fund-ready, financially-audited DOCX — orchestrated by Claude across deterministic Python and a fleet of parallel sub-agents.

Problem

Official grant reports demand a DOCX with a line-by-line financial audit, assembled from gigabytes of messy scans, contracts, payment orders and estimates. By hand it's days of tedious, error-prone work.

Solution

Claude (Opus, 1M context) orchestrates the run; Python scripts do the deterministic part — indexing, extraction, DOCX assembly, financial audit
6 parallel OCR sub-agents read scans, contracts and payment orders; 3 more parse the finances and write the narrative sections
Confidential financial and personal data is processed by local, on-device models — nothing sensitive ever leaves the machine
A document-linking graph and triangulated audit reconcile every figure against limits and primary documents
Per-row Word comments, an auto "remarks to check" doc and a clean fund-ready copy — then a mandatory self-review pass
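One deterministic piece of such an audit, checking payments against budget limits, might look like the sketch below. It covers only that single check, not the document-linking graph or the full triangulation; the budget lines, amounts, document ids and the `audit` helper are hypothetical.

```python
from decimal import Decimal

def audit(budget_limits, payments):
    """Sum payments per budget line and collect remarks for a human to check.

    budget_limits: {line_name: Decimal limit}
    payments: list of (line_name, Decimal amount, document_id)
    """
    spent = {}
    remarks = []
    for line, amount, doc_id in payments:
        if line not in budget_limits:
            remarks.append(f"{doc_id}: unknown budget line {line!r}")
            continue
        spent[line] = spent.get(line, Decimal("0")) + amount
    for line, limit in budget_limits.items():
        total = spent.get(line, Decimal("0"))
        if total > limit:
            remarks.append(f"{line}: spent {total} exceeds limit {limit}")
    return spent, remarks

# Illustrative data only.
limits = {"equipment": Decimal("1000"), "travel": Decimal("500")}
payments = [
    ("equipment", Decimal("600"), "PO-1"),
    ("equipment", Decimal("500"), "PO-2"),
    ("travel", Decimal("200"), "PO-3"),
    ("catering", Decimal("50"), "PO-4"),
]

spent, remarks = audit(limits, payments)
```

`Decimal` is used instead of floats so that sums of monetary amounts stay exact, and every remark carries enough context to trace it back to a source document.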

Result

~1.5 GB of source materials → an audited, fund-ready report in a single run, every line cross-checked and commented. Days of manual work collapse into minutes.

Claude Opus (1M) · Sub-agents · Local LLM · OCR · python-docx · Financial Audit · Orchestration

n8n · report pipeline

Automation workflow tree of the report pipeline

parallel sub-agents · 1000+ documents per report · 1 command «Go!»

CASE 08 · Healthcare · Privacy-by-design · NDA

A closed-loop app for athlete health data

A clinical app for sports medicine: it parses lab reports into structured markers, tracks trends against a reference-value catalog, and exports PDF/XLSX — behind role-based access and a full audit trail. Special-category medical data never leaves the closed loop.

Problem

Athletes' lab results — special-category personal data — were scattered across PDFs from different labs. Doctors needed trends over time in one place, but cloud services are a non-starter for sensitive medical data.

Solution

Parses lab-report PDFs / XLSX from many labs into structured markers (athlete → report → test)
Trends over time per marker, normalized against a reference-value catalog; dashboards and PDF/XLSX export
Role-based access (admin / medic / viewer) with per-record scoping, brute-force protection and a full audit log of every action
Closed loop by design: runs on-prem / behind a corporate VPN, isolated from the public internet; encrypted off-box backups
A companion smartphone app for access on the go
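Role-based access with an audit trail, as listed above, comes down to checking a permission and logging every attempt, allowed or not. A minimal sketch: the three role names come from the description, while the permission sets, field names and the `authorize` helper are assumptions, and per-record scoping is left out for brevity.

```python
from datetime import datetime, timezone

# Roles are from the description; the permissions per role are assumed.
PERMISSIONS = {
    "admin": {"read", "write", "manage_users"},
    "medic": {"read", "write"},
    "viewer": {"read"},
}

audit_log = []

def authorize(user, role, action, record_id):
    """Return whether the action is allowed and log the attempt either way."""
    allowed = action in PERMISSIONS.get(role, set())
    audit_log.append({
        "at": datetime.now(timezone.utc).isoformat(),
        "user": user,
        "role": role,
        "action": action,
        "record": record_id,
        "allowed": allowed,
    })
    return allowed

# A viewer may read a report but not change it; both attempts are logged.
can_read = authorize("anna", "viewer", "read", record_id=101)
can_write = authorize("anna", "viewer", "write", record_id=101)
```

Logging denied attempts alongside successful ones is what makes the trail useful for spotting misuse rather than just recording normal activity.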

Result

Doctors get longitudinal trends and instant reports; the organization keeps full control — medical data stays inside the perimeter, every access is logged, and nothing goes to the cloud.

FastAPI · Python · SQLite · PDF parsing · RBAC · Audit log · On-prem · Mobile

smaglo.space/medicine

15K+ lab tests structured · 3 access roles · 0 data leaves the loop

SEC / 05

How to work with me

For 90% of tasks you don't need a separate dev team — one specialist plus automation covers the same work. You work with me directly: no agency markup, no team on payroll — so it costs a fraction of a team.

// The first call is free and commitment-free. Prices are a ballpark — I quote the exact figure after the diagnostic, but you see the order of magnitude upfront.

AI diagnostic

for those unsure whether they need AI and where to start

free

review + report in 3–5 days · no commitment

I map your processes and data: where AI genuinely makes money and where it's just hype
2–3 concrete scenarios for your business, with an estimate of impact, timeline and budget
I'll tell you honestly if the task is better solved without neural nets — no upselling
A short roadmap report you can act on — even with a different contractor

what you get

A clear map: what to do first, what it yields in money, and whether to start at all — before you've spent a dollar on development.

Start with a diagnostic

Turnkey build

one task, fixed price, a working result

from $2,000

fixed per project · 2–4 weeks · price known upfront

I take one task from the diagnostic and ship it as a working product — from data to interface
A dashboard, ML model, CV system or n8n automation — on your data, not a demo
Fixed price and timeline — no billing surprises, no open-ended hourly meter
Handed over with documentation, and I train your team so you don't depend on me

what you get

Your first measurable AI result inside your own systems — in weeks, not half a year. Then you decide on facts: scale, stop, or move to a subscription.

Discuss your task

recommended

AI department on subscription

a dedicated specialist in your team, month to month — no hiring, no bloated project

from $1,500/mo

monthly · pause or cancel anytime, no penalties

Your data scientist / AI engineer on call: data, ML, computer vision, AI agents
We work through your tasks by priority: models, automation, dashboards, knowledge-base assistants
Ongoing development and support of what's built — not a one-off handoff and goodbye
Priority replies and included hours every month, with a transparent plan of work
Stop once your tasks are done — no penalties, no obligations

what you get

An external AI department for a fraction of the cost of a team: expertise and hands-on work month to month, small steps, and the option to leave anytime.

Discuss a subscription

// why a subscription

Why a subscription beats hiring and agencies

A dev team means several salaries, taxes, equipment and months to hire. Even one full-time data hire runs ~$8–12K/mo fully loaded — plus the risk there's no work left for them in six months.

An agency books months and a five-figure-plus budget, much of which pays for account managers and slide decks, even when the real work takes a few weeks.

You work with me directly, and automation handles the routine: one person does the work of a small team — no markup, no idle time. Usually 5–10× cheaper, and you can start or stop anytime.

One or two shipped workflows — a knowledge-base assistant, routine automation, a dashboard — pay for a month of the subscription.

Start with a free chat. Scope and format are flexible — stop or pause at any step, no penalties or long contracts.

I don't just train models. I take them to production.


Got a data challenge ↗
