Sabbir Hossain
Hi, I'm Sabbir Hossain

Data engineer. Backend-minded. Platform-first.

I build data and backend systems that need to hold up in the real world. Bell Canada is where I'm doing that now; Johns Hopkins and UofT are the research base I still bring into the work.

role: Data engineer / Platform
status: Open to the right team
open to right role / data engineer / data platform engineer / backend + infra / AI engineer
mobility: Canada based, US-ready
Canadian citizen / Toronto, Canada / TN-eligible
production: Bell ownership
Bell Canada / NTS owner / CS Attack RCA
research: Research systems
Johns Hopkins / UofT research / 750+ TB
proof: Outside proof
Harvard NCRC / ABRCMS awards / 3 manuscripts
next: Next team fit
platform-first / owns the loop / backend-minded
Start at the top and read down: UofT, Hopkins, awards, Bell, shipped work, and the kind of role I am aiming for next.

One career graph: what happened, when, and what it proves.

dag: career_v9_time_dag / schedule: @continuous / owner: @sabbir / read_mode: time_enhanced / range: 2015 - next / 3 running / 1 queued

Time-enhanced vertical career pipeline DAG

legend: formation / source / research / signal / industry / awaiting
real-world classroom
the long way in

Formation Layer

Started in Life Sciences and took the long road to find the right fit. Odd jobs, course corrections, and the kind of lessons school does not teach, then a clean pivot into the program that worked.

[ 2015 UofT Life Sciences starts ][ 2017 odd jobs + real-world detour ][ 2018 pivot into BioInfo + CompBio specialist ]
specialist degree
specialist base

Source Layer

Bioinformatics & Computational Biology specialist - the right fit. Where the technical base actually got built: CS, bioinformatics, immunology, and systems thinking.

[ 2018 BioInfo + CompBio specialist starts ][ 2022 research roles begin ][ 2024 BSc honours - 3.96 major GPA ]
research runtime
systems before industry

Research Platform Layer

Where research software started becoming production-minded engineering.

[ 2019 UofT research software ][ 2022 Hopkins oncology platform ]
outside proof
outside the code

External Signal Layer

Awards, talks, and manuscripts that made the work visible outside the code.

[ 2023 ABRCMS oral award ][ 2024 Harvard NCRC plenary ][ 2024 ABRCMS poster award ][ 2026 3 manuscripts in review ]
owner + outputs
bell canada · data eng. & AI

Production Runtime

One job. Real ownership of the pipelines, and the live outputs stakeholders actually use.

Ownership: pipelines I own end-to-end
Live outputs: what stakeholders consume
[ 2025 Bell Data Eng. & AI ][ now NTS owner + RCA lead ]
platform step
queued

Next Target Layer

The kind of role I want next.

[ next data platform / backend ][ next architecture with ownership ]
I want the first read to answer the obvious questions: what I own now, what I have shipped, and where the receipts are.

The hiring case, without making you dig.

production owner

I am comfortable in the messy middle

Pipeline failures, unclear ownership, historical data recovery, and executive-facing RCA work are not side quests for me. That is the kind of work I know how to carry.

NTS owner / 78,000+ records recovered / director + VP RCA
research engineer

I bring the research habits with me

The research background shows up in how I build: provenance, reproducibility, scale, and evidence before hand-waving.

750+ TB multi-omics / 100+ users / 3 manuscripts
platform trajectory

I like systems that compound

The best next fit is data engineering with backend, infrastructure, and platform ownership: the kind of work that makes teams faster and systems easier to trust.

backend overlap / data contracts / operational design
best-fit role

Data engineering, with platform ownership.

Bell Canada is where I'm doing production work now. The next role should use all of it: data systems, backend judgment, research discipline, and ownership that helps more than one team.

  • scale - systems with real users, load, and SLOs
  • ownership - messy systems I can make reliable
  • platform - work that helps more than one team
  • craft - tests, reviews, runbooks, and post-mortems
Read the evidence
Every role, project, and proof point on one canvas — click any node, land on the receipts.

Want to trace the work? Open the map.

Best-fit lane: Data engineering / Backend systems / Platform thinking / ETL / ELT pipelines / Data warehousing.
Experience

Bell is where I'm focused now. Hopkins and UofT built the depth.

📡 Industry

Data Engineer

Bell Canada

Primary owner of Bell's Network Ticket Service pipeline, with growing ownership across CS Attack RCA work and CSAP delivery for cross-country Bell stakeholders.

78 attributes delivered
78,000+ records recovered
83% query optimization
  • Built and productionized the mission-critical Network Ticket Service data pipeline on Teradata using a three-tier ETL and ELT architecture across staging, warehouse, and analysis layers.
  • Integrated four operational systems including REST API event streams, ERP, billing, and directory services using Python and SAS Data Integration while enforcing data contracts and Kimball-style dimensional modeling patterns.
  • Built a stateful sessionization algorithm in Python to fix event sequencing defects, refactoring a flawed sequential method into a robust two-pass group-by propagation model.
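The two-pass group-by idea in the last bullet can be illustrated with a small sketch. This is not the Bell implementation, which is not shown here. It assumes hypothetical field names (`ticket_id`, `ts`, `kind`) and an assumed inactivity gap (`GAP_SECONDS`) as the session boundary rule. Pass 1 groups events by key and orders each group by time; pass 2 walks each ordered group and propagates the current session number forward.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Event:
    ticket_id: str  # hypothetical grouping key
    ts: int         # event time in seconds (assumed unit)
    kind: str       # e.g. "open", "update", "close"


GAP_SECONDS = 1800  # assumed: a longer quiet period starts a new session


def sessionize(events: List[Event]) -> List[Tuple[Event, int]]:
    # Pass 1: group by ticket, then sort each group by time so the
    # arrival order of the input no longer affects the result.
    groups: Dict[str, List[Event]] = defaultdict(list)
    for ev in events:
        groups[ev.ticket_id].append(ev)
    for group in groups.values():
        group.sort(key=lambda e: e.ts)

    # Pass 2: within each ordered group, propagate the current session
    # number forward and bump it whenever a boundary (large gap) is seen.
    labelled: List[Tuple[Event, int]] = []
    for group in groups.values():
        session = 0
        prev_ts = None
        for ev in group:
            if prev_ts is not None and ev.ts - prev_ts > GAP_SECONDS:
                session += 1
            labelled.append((ev, session))
            prev_ts = ev.ts
    return labelled


if __name__ == "__main__":
    sample = [
        Event("T1", 5000, "update"),  # arrives first but happened last
        Event("T1", 0, "open"),
        Event("T2", 100, "open"),
        Event("T1", 600, "update"),
    ]
    for ev, session in sessionize(sample):
        print(ev.ticket_id, ev.ts, ev.kind, "session", session)
```

A single sequential scan over arrival order would misplace the out-of-order `T1` event; grouping and sorting first makes the propagation step deterministic.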
🧬 Research

Bioinformatics Software Development Research Assistant

Johns Hopkins University

Ongoing spare-time oncology research, full-stack bioinformatics platforms, and ML-driven multi-omics analysis on HPC infrastructure.

750+ TB data integrated
8 novel biomarkers
83% load time reduction
  • Reduced analysis load times by 83 percent through optimized caching on a full-stack bioinformatics platform supporting 100+ global researchers.
  • Built the platform using Python, R, JavaScript, and C with microservices architecture, SOLID principles, and Docker containerization.
  • Engineered scalable ETL pipelines processing over 750 terabytes of multi-omics data on HPC clusters, accelerating biomarker discovery by 40 percent.
🎓 Research

Software Development Research Assistant

University of Toronto

The foundation: software platforms, reproducible research tooling, and workflow automation across multiple wet-lab teams.

30+ hours saved weekly
7 research teams
50% setup time reduction
  • Reduced analysis effort by more than 30 hours per week across 7 research teams by engineering full-stack bioinformatics platforms.
  • Built automation using Python, R, C, and Java with object-oriented programming patterns to streamline lab workflows.
  • Owned the full software development life cycle from requirements through deployment and maintenance.
Projects

Selected systems that show how I build when the work has to ship.

📊 Bell Canada

Enterprise Analytics Platform

78-attribute MicroStrategy analytics platform integrating SmartPath, Maximo, IPACT, and LDAP into one decision surface.

Built derived metrics, conditional formatting, cross-filter interactivity, and a structured migration path from development to production with director sign-off.

🔮 Bell Canada

NTS/MS Archway Pipeline

Three-tier ETL pipeline integrating SmartPath API, Maximo, IPACT, and LDAP into unified Control Plan reporting.

Processes 150,000+ records in roughly 20 minutes across staging, warehouse, and analytics layers using schema-aware loads across DEV, QA, and PROD.
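As a rough illustration of the staging, warehouse, and analytics split, here is a minimal sketch. It swaps in SQLite (Python's built-in `sqlite3`) for the production warehouse, and every table, view, and column name (`stg_tickets`, `wh_tickets`, `an_status_counts`) is hypothetical. It shows the shape of the three tiers only, not the actual pipeline.

```python
import sqlite3
from typing import List, Optional, Tuple

Row = Tuple[Optional[str], str, str]  # (ticket_id, status, opened_at)


def run_three_tier(rows: List[Row]) -> List[Tuple[str, int]]:
    con = sqlite3.connect(":memory:")

    # Tier 1, staging: land the raw rows untouched, as loose text.
    con.execute(
        "CREATE TABLE stg_tickets (ticket_id TEXT, status TEXT, opened_at TEXT)"
    )
    con.executemany("INSERT INTO stg_tickets VALUES (?, ?, ?)", rows)

    # Tier 2, warehouse: cleaned and keyed. Rows are inserted in
    # opened_at order, so OR REPLACE keeps the latest row per ticket.
    con.execute(
        "CREATE TABLE wh_tickets ("
        "ticket_id TEXT PRIMARY KEY, status TEXT NOT NULL, opened_at TEXT NOT NULL)"
    )
    con.execute(
        """
        INSERT OR REPLACE INTO wh_tickets
        SELECT TRIM(ticket_id), UPPER(TRIM(status)), opened_at
        FROM stg_tickets
        WHERE ticket_id IS NOT NULL AND TRIM(ticket_id) <> ''
        ORDER BY opened_at
        """
    )

    # Tier 3, analytics: a reporting view over the warehouse table.
    con.execute(
        "CREATE VIEW an_status_counts AS "
        "SELECT status, COUNT(*) AS n FROM wh_tickets GROUP BY status"
    )
    result = con.execute(
        "SELECT status, n FROM an_status_counts ORDER BY status"
    ).fetchall()
    con.close()
    return result


if __name__ == "__main__":
    raw = [
        ("T1", "open", "2025-01-01"),
        ("T1 ", "closed", "2025-01-02"),  # same ticket, later state
        ("T2", "Open", "2025-01-01"),
        ("", "open", "2025-01-03"),       # rejected: empty key
    ]
    print(run_three_tier(raw))
```

Keeping raw rows in staging means a bad transformation can be rerun without re-pulling from the source systems.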

🔍 Bell Canada

Data Quality Recovery System

Full-stack RCA effort that corrected historical data integrity drift and restored analytical confidence.

Executed staged historical recasts correcting 78,000+ records, expanding analytical coverage from 1 month to 9+ months and improving match accuracy to the strongest level since inception.

🧬 Johns Hopkins University

Bioinformatics Platform

Open-source full-stack bioinformatics platform used by researchers for visualization, simulation, and analysis workflows.

Built with React, D3.js, R Shiny, Python, WebSockets, and Docker using microservices architecture and SOLID design principles.

🧪 Johns Hopkins University

Multi-Omics Data Pipeline

Scalable processing pipeline integrating DISQOVER, ENCODE, PCAWG, PRIDE, and TCGA data for cancer biomarker analysis.

Applied SVM-RFE, Random Forest, and HPC workflows to help identify 8 novel biomarkers and accelerate validation timelines.

🧼 Independent build

ProofMark Studio

Hub for the ProofMark document-craft tool line — one catalog of ~50 PDF, text, and publishing utilities built as a single React SPA over a thin FastAPI shell.

Three sibling FastAPI apps (hub, proofmark-pdf, text-cleaner) composed by URL rather than imports, so each surface stays independently editable, deployable, and testable. The hub renders the catalog, routes to each tool, and shares a design system of color tokens and SVG illustrations across tools.
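A minimal sketch of what "composed by URL rather than imports" can look like on the hub side, using only the standard library. The base URLs, tool slugs, and paths below are placeholders invented for illustration; only the app names `proofmark-pdf` and `text-cleaner` come from the description above.

```python
from urllib.parse import urljoin

# Placeholder base URLs: each sibling app is addressed over HTTP,
# never imported as a Python package.
SIBLING_APPS = {
    "proofmark-pdf": "https://pdf.example.com/",
    "text-cleaner": "https://text.example.com/",
}

# Hypothetical catalog entries the hub would render and route from.
CATALOG = [
    {"slug": "merge-pdf", "app": "proofmark-pdf", "path": "tools/merge"},
    {"slug": "strip-whitespace", "app": "text-cleaner", "path": "tools/strip-whitespace"},
]


def tool_url(slug: str) -> str:
    """Resolve a catalog slug to the URL of the sibling app that serves it."""
    for entry in CATALOG:
        if entry["slug"] == slug:
            return urljoin(SIBLING_APPS[entry["app"]], entry["path"])
    raise KeyError(f"unknown tool: {slug}")


if __name__ == "__main__":
    print(tool_url("merge-pdf"))
```

Because the hub knows only URLs, a sibling app can be redeployed or rewritten without the hub changing anything except a base URL.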

Third-party proof

Outside proof that the work held up in front of real review.

Talks: National-stage speaking, selected from real competition.

Plenary and oral selections from competitive applicant pools across two national venues.

Harvard NCRC plenary '24: 1 of 12 from 5,000 / ABRCMS oral award '23: Comp + Systems Bio division
Posters: Poster recognition outside my own team.

Poster awards across graduate and national research divisions, beyond the code itself.

ABRCMS poster award '24: Graduate division / Harvard NCRC poster '24: Multi-omics + ML
Manuscripts: Lead-author work, in review.

Three manuscripts in active review, written from the technical work behind the research.

3 lead-author papers / Under peer review / Public proof in archive
Stack

The tools I actually reach for when the work has to land.

Languages

Core languages for backend, scripting, and analytical work.

Python / SQL / Java / JavaScript / TypeScript / R / C / Bash
Data & ML

ETL pipelines, dimensional modeling, and ML workflows — the data plumbing that has to hold up under real load.

Pandas / PyTorch / TensorFlow / Scikit-learn / Apache Spark / Kafka / Airflow
Cloud & DevOps

Deployment, orchestration, infrastructure, and platform operations.

AWS / GCP / BigQuery / Docker / Kubernetes / Terraform / Git / Linux
Web & Data Stores

Apps, APIs, and the data systems behind them.

React / Next.js / Node.js / PostgreSQL / MongoDB / Redis / GraphQL
Analytics & Visualization

BI, charting, and exploratory analysis.

MicroStrategy / D3.js / Tableau / Power BI / Jupyter / Excel
Methodology & Delivery

Planning, review, testing, and shipping without chaos.

Jira / Confluence / Agile / Scrum / Kanban / CI/CD / TDD
Foundation

Strong CS and bioinformatics foundations, plus a lot of range.

Schooling

University of Toronto

Campus: St. George Campus
Degree: Bachelor of Science (Honours)
Graduated: June 2024
Specialist: Computer Science, Bioinformatics & Computational Biology
Minor: Immunology
Major GPA: 3.96 / 4.0
Coursework highlights

Computer Science

CSC108H1 / CSC148H1 / CSC165H1 / CSC207H1 / CSC209H1 / CSC236H1 / CSC263H1 / CSC373H1

Bioinformatics & Computational Biology

BCH441H1 / BCB410H1 / BCB420H1 / BCB330Y1 / BCB430Y1

Mathematics & Statistics

MAT135H1 / MAT136H1 / STA247H1 / STA237H1

Biochemistry & Immunology

BCH210H1 / BCH311H1 / IMM250H1 / IMM340H1 / IMM350H1

Context

More context, if you want the longer read.

I am a Data Engineer at Bell Canada on the Data Engineering & Artificial Intelligence team. Most of my day-to-day is production pipeline ownership, analytics platform work, cross-domain debugging, and making sure the data layer holds up when people depend on it.

Right now

Data Engineer at Bell Canada

Bell Business Markets, Data Engineering & Artificial Intelligence team

Research background

5 yrs 9 mo pre-industry

University of Toronto and Johns Hopkins across software, ML, and bioinformatics

Academic foundation

Honours BSc, 3.96 major GPA

Computer Science + Bioinformatics specialist with an Immunology minor

Why this all fits together

Before Bell, I spent 5 years and 9 months building research software across the University of Toronto and Johns Hopkins. I still continue some Hopkins research in my spare time because I genuinely enjoy the work. That path led to Harvard NCRC, oral and poster presentation wins at ABRCMS, 750+ TB of multi-omics data, and three lead-author manuscripts now under review.

The best fit for me right now is primary data engineering work, with a clear path toward data platform engineering and strong overlap with backend or infrastructure-heavy software roles. I like clear abstractions, durable systems, and solving messy technical problems without turning them into a circus.

Best fit
Data engineering / Backend systems / Platform thinking / ETL/ELT pipelines / Data warehousing / SQL optimization / Dimensional modeling / Cloud architecture / CI/CD / Technical leadership
Still a human being
🧠 Learning Mathematics / 💻 Coding / 🚀 Space / 🧬 Bioinformatics / 🤝 Mentoring / 🍳 Cooking / 📚 Reading / 🎮 Gaming / 🏃 Fitness
Home base

Canada

Canadian citizen. Fully authorized to work in Canada.

Based in Toronto, Ontario. Home base is clear, with flexibility for the right team setup.

NEXUS card holder. Cross-border travel is easy when the work needs it.

US work status

United States

TN visa eligible. No sponsorship track, lottery, or employer immigration cost burden.

Open to US relocation.

Open to long-term paths. H-1B or green card sponsorship is fine if the role grows that way.

Contact

If you're hiring for serious technical ownership, let's talk.

I'm focused on data engineering, platform, backend, and software roles where architecture, reliability, and follow-through actually matter.