BDJobs Scraper • Aryan Rahman: Building Reality

A scraper for BDJobs.com that collects job market data from Bangladesh (salary, requirements, company info, etc.). It extracts 31+ fields per job.

Disclaimer: This scraper is bound to be brittle and may break if any of the APIs change. For educational and research purposes only.

TL;DR

BDJobs moved to an Angular SPA, so HTML scraping got painful. I switched to their internal JSON APIs.
Pulled ~5.5k job details in ~10 minutes using async batching + retries.

Tech Stack: Python, asyncio, aiohttp, BeautifulSoup4

How it works

BDJobs migrated from server-rendered pages to an Angular SPA, which broke my original HTML scraper. But digging through the Network tab revealed something better: their internal REST API.

Two endpoints power the whole thing:

List API — returns paginated job listings (~60 per page)
Details API — returns complete job info (found buried in Angular’s bundle)

No authentication, lenient rate limits, structured JSON. Much cleaner than parsing HTML with brittle CSS selectors.
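To make the two-endpoint flow concrete, here is a minimal sketch of how the list and details calls could fit together. The URLs, the `page` and `jobId` parameters, and the `data` and `jobId` response fields are all placeholders I made up for illustration; the real endpoint paths and payload shapes are not reproduced here. The fetch function is passed in so the flow can be tried without touching the network.

```python
import asyncio
from typing import Any, Awaitable, Callable

# Placeholder URLs: the real endpoints are not given in this write-up.
LIST_URL = "https://example.invalid/api/jobs/list"
DETAILS_URL = "https://example.invalid/api/jobs/details"

FetchJson = Callable[[str, dict], Awaitable[Any]]


async def collect_job_ids(fetch_json: FetchJson, max_pages: int) -> list[str]:
    """Walk the paginated list endpoint and gather job IDs."""
    job_ids: list[str] = []
    for page in range(1, max_pages + 1):
        payload = await fetch_json(LIST_URL, {"page": page})
        jobs = payload.get("data", [])  # "data" and "jobId" are assumed field names
        if not jobs:
            break  # an empty page is treated as the end of the listings
        job_ids.extend(str(job["jobId"]) for job in jobs)
    return job_ids


async def fetch_details(fetch_json: FetchJson, job_id: str) -> dict:
    """Fetch the full record for one job from the details endpoint."""
    return await fetch_json(DETAILS_URL, {"jobId": job_id})


def make_aiohttp_fetcher(session) -> FetchJson:
    """Adapt an aiohttp.ClientSession to the fetch_json shape used above."""
    async def fetch_json(url: str, params: dict) -> Any:
        async with session.get(url, params=params) as resp:
            resp.raise_for_status()
            return await resp.json()
    return fetch_json
```

With aiohttp installed, `make_aiohttp_fetcher(session)` can be handed to both functions inside an `async with aiohttp.ClientSession() as session:` block.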

Performance

~110 pages of listings in ~28 seconds
~5,500 job details in ~8 minutes (batched, 20 concurrent)
Total: ~10 minutes for the complete dataset

The bits I cared about

Dynamic CSV columns — automatically detects new API fields and adds them. Future-proof.
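One way to get dynamic columns is to build the header from the union of keys across all records before writing. This is a sketch of that idea rather than the scraper's actual code; the function name and the sample field names in the usage are hypothetical.

```python
import csv
import io


def write_dynamic_csv(records, out):
    """Write dict records to CSV, using the union of all keys as the header."""
    fieldnames = []
    seen = set()
    for record in records:
        for key in record:
            if key not in seen:  # first time this field shows up
                seen.add(key)
                fieldnames.append(key)
    # restval fills in records that lack a column seen elsewhere
    writer = csv.DictWriter(out, fieldnames=fieldnames, restval="")
    writer.writeheader()
    writer.writerows(records)
    return fieldnames


# Usage with made-up records: the second one introduces a new field.
buffer = io.StringIO()
columns = write_dynamic_csv(
    [{"title": "A", "salary": "1"}, {"title": "B", "deadline": "x"}],
    buffer,
)
```

The trade-off is that every record has to be in memory (or read twice) before the header is known, which is fine for a few thousand jobs.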

Batch processing — processes jobs 20 at a time with connection pooling. Doesn’t overwhelm the server, doesn’t eat memory.
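The batching can be sketched roughly like this: slice the job IDs into groups of 20 and await each group with `asyncio.gather` before starting the next. `fetch_one` is a stand-in for whatever coroutine fetches a single job, and the function names are mine. For the connection pooling side, aiohttp's `TCPConnector(limit=20)` is one way to cap open connections, though I'm not claiming that is the exact setting used here.

```python
import asyncio

BATCH_SIZE = 20  # matches the "20 at a time" figure above


def chunked(items, size):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


async def process_in_batches(job_ids, fetch_one, batch_size=BATCH_SIZE):
    """Fetch jobs batch by batch; at most `batch_size` requests are in flight."""
    results, failed = [], []
    for batch in chunked(job_ids, batch_size):
        # return_exceptions=True keeps one failure from cancelling the batch
        outcomes = await asyncio.gather(
            *(fetch_one(job_id) for job_id in batch),
            return_exceptions=True,
        )
        for job_id, outcome in zip(batch, outcomes):
            if isinstance(outcome, BaseException):
                failed.append(job_id)  # left for a later retry pass
            else:
                results.append(outcome)
    return results, failed
```

Because each batch finishes before the next one starts, only one batch's worth of in-flight responses is held at a time.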

Retry mechanism — failed requests get queued and retried up to 10 times.
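A retry queue along these lines could look like the sketch below. It reads "retried up to 10 times" as 10 retries after the first attempt, which is my assumption, and it runs sequentially to keep the queue logic easy to follow. `fetch_one`, the function name, and the `delay` parameter are hypothetical.

```python
import asyncio
from collections import deque

MAX_RETRIES = 10  # assumed to mean retries after the initial attempt


async def fetch_with_retry_queue(job_ids, fetch_one, max_retries=MAX_RETRIES, delay=0.0):
    """Try each job; re-queue failures until they succeed or run out of retries."""
    queue = deque((job_id, 0) for job_id in job_ids)  # (job_id, retries used)
    results, gave_up = {}, []
    while queue:
        job_id, retries_used = queue.popleft()
        try:
            results[job_id] = await fetch_one(job_id)
        except Exception:
            if retries_used < max_retries:
                # back of the queue, so other jobs get a turn first
                queue.append((job_id, retries_used + 1))
                await asyncio.sleep(delay)
            else:
                gave_up.append(job_id)
    return results, gave_up
```

Sending failures to the back of the queue gives a flaky request some breathing room before it is tried again; a non-zero `delay` adds a pause on top of that.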

One note: I’m careful about not hammering sites. Concurrency is capped, and this is meant for research/learning, not abuse.
