Solution Architect Web Scraping & Data Engineering @ Merit Data

Job Description


Roles & Responsibilities: Solution Architect (Web Scraping & Data Engineering)


1. Pre-Sales, Proposal & Estimation (Critical Responsibility)

  • Lead technical solutioning for RFPs/RFIs related to scraping projects
  • Conduct client discovery sessions to understand data sources, SLAs, volume, and compliance needs
  • Perform target site feasibility analysis:
    • Anti-bot complexity
    • Dynamic content & login barriers
    • Geo-restrictions and rate limits
  • Own end-to-end effort estimation:
    • Development effort
    • Infrastructure sizing
    • Proxy & CAPTCHA cost estimation
    • Maintenance and risk buffer
  • Create solution architecture diagrams, assumptions, and tech stack recommendations
  • Define SLAs, KPIs, and acceptance criteria for proposals
  • Act as technical SPOC in client presentations, defence sessions, and negotiations
  • Maintain and improve estimation frameworks, templates, and historical benchmarks
  • Provide final technical feasibility and risk sign-off for all proposals

2. Web Scraping Architecture & Design (Primary Focus)

  • Design end-to-end scraping systems:
    • Crawl orchestration → extraction → parsing → storage → consumption
  • Architect resilient scraping platforms handling:
    • JavaScript-heavy sites
    • CAPTCHA challenges
    • Rate limits & IP blocking
    • Frequent DOM changes
  • Define reference architectures for high-scale, high-frequency scraping use cases
  • Select appropriate frameworks, proxy solutions, and anti-bot strategies
  • Establish data quality standards:
    • Deduplication
    • Validation
    • Schema evolution

3. Data Engineering Architecture (Secondary Focus)

  • Design data pipelines to ingest, transform, and serve scraped data
  • Architect data lakehouse / warehouse solutions
  • Define data modelling strategies (Dimensional, Data Vault, Hybrid)
  • Establish ETL/ELT patterns and orchestration strategies
  • Implement data quality, lineage, and cataloging practices
  • Optimize storage design:
    • Formats, partitioning, indexing for performance & cost

4. Delivery & Technical Leadership

  • Translate business needs into technical designs, sprint plans, and roadmaps
  • Develop key design documents:
    • High-Level Design (HLD)
    • Low-Level Design (LLD)
    • Architecture Decision Records (ADRs)
  • Provide hands-on guidance and code reviews
  • Mentor scraping engineers and data engineers
  • Lead Proof-of-Concept (PoC) implementations for complex use cases

5. Compliance, Risk & Governance

  • Ensure adherence to:
    • GDPR, CCPA, DPDP
    • Robots.txt and copyright laws
  • Define ethical scraping standards and PII handling guidelines
  • Conduct risk assessments for target data sources
  • Recommend mitigation strategies for legal and technical risks

6. Performance, Cost & Operations Management

  • Define and monitor SLAs:
    • Data freshness
    • Completeness
    • Accuracy
  • Design monitoring and alerting mechanisms for pipelines
  • Implement self-healing and fault-tolerant systems
  • Optimize cost across infrastructure, proxies, and storage
  • Ensure high availability and scalability of scraping systems

7. Documentation & Deliverables Ownership

  • Produce proposal-phase deliverables:
    • Technical solution documents
    • Feasibility & risk assessments
    • Estimation models
  • Deliver execution-phase artifacts:
    • HLD/LLD documents
    • ADRs
    • PoCs and reference implementations
  • Maintain:
    • Engineering standards
    • Code review logs
    • Runbooks and monitoring playbooks
  • Lead knowledge transfer and handover sessions

8. Stakeholder & Client Engagement

  • Act as primary technical authority across engagements
  • Communicate with:
    • Clients (including CXO-level)
    • Sales teams
    • Delivery leadership
  • Support technical negotiations and decision-making
  • Ensure alignment between business goals and technical execution

Job Classification

Industry: IT Services & Consulting
Functional Area / Department: IT & Information Security
Role Category: IT & Information Security - Other
Role: IT & Information Security - Other
Employment Type: Full time

Contact Details:

Company: Merit Data
Location(s): Chennai


Key skills: XPath, PySpark, Scrapy, BeautifulSoup, Scrapy Cluster, httpx, lineage, Crawling, Scrapy-Redis, ETL / ELT design, EMR, Requests, Frontera, Redshift, parsel, regex, Synapse, Data validation, CSS selectors, cataloging, lxml, Databricks, Crawlee, Playwright, Automation

₹ 16-27.5 Lacs P.A

Merit Data

We build DATA + CODE for some of the world's leading B2B brands. Through pioneering marketing data, product data, customised software and marketing automation, Merit helps brands of all sizes confidently embrace the challenges of the future.
