We are Progress (Nasdaq: PRGS) – an experienced, trusted provider of products designed with customers in mind so they can develop the applications they need, deploy where and how they want, and manage it all safely and securely.
We’re proud to have a diverse, global team where we value the individual and enrich our culture by considering varied perspectives because we believe people power progress. Join us as a Senior Data Engineer and help us do what we do best: propelling business forward.
We are seeking an experienced and driven Data Engineer to join our AI First team, part of the Web Components and Tools division. In this role, you will turn large volumes of unstructured content (documentation, knowledge-base articles, API references, etc.) into high-quality, AI-ready data that powers our products.
In this role, you will:
Design, develop, and maintain scalable data pipelines for collecting, ingesting, transforming and monitoring text data from multiple sources.
Process and analyze large datasets to support AI model embedding and fine-tuning.
Validate, clean and categorize data to ensure quality, security and usability; surface anomalies and gaps through automated checks.
Prepare data for RAG (Retrieval-Augmented Generation) workflows by splitting and chunking documents, managing metadata, and ensuring smooth integration with vector stores.
Build and optimize database pipeline architectures for performance and reliability.
Implement data governance and security best practices.
Collaborate with cross-functional teams to deliver data solutions aligned with business goals.
Contribute to the creation of ground-truth datasets and evaluation harnesses for future Gen-AI features.
Stay current with emerging AI tools and trends to drive innovation.
Your background:
Bachelor’s or Master’s degree in Computer Science, Data Science, or a related field.
3–5 years of experience as a Data Engineer or Data Scientist, preferably in a cloud environment.
Proficiency with big data systems and tools (e.g., Apache Spark, Kafka, vector databases).
Knowledge of data science workflows and libraries.
Strong SQL and Python skills; familiarity with text-processing/ML libraries (pandas, PySpark, Hugging Face).
Hands-on experience with cloud platforms, especially Microsoft Azure.
Exposure to vector search technologies (e.g., pgvector, Pinecone, Milvus, Azure AI Search).
Solid understanding of machine learning fundamentals.
Strong analytical and problem-solving skills.
Effective communication and collaboration abilities.
If this sounds like you and fits your experience and career goals, we’d be happy to chat. What we offer in return is the opportunity to experience a great company culture with wonderful colleagues to learn from and collaborate with, and also to enjoy:
Compensation
Generous remuneration package
Employee Stock Purchase Plan Enrollment
Family and Health
30 days paid annual vacation
An extra day off for your birthday
2 additional days off for volunteering
Premium healthcare and dental care coverage
Additional pension insurance
Well-equipped gym on-site with CrossFit equipment and a climbing wall
Co-funded Multisport card
Daycare Center for your little ones onsite
Flexible working hours and the opportunity to work from home
Free underground parking with a designated space for bikes and electric scooters