JOB TITLE: Senior Data Engineer
DEPARTMENT: Software Development
REPORTS TO: Project Manager
PURPOSE:
The Senior Data Engineer will play a critical role in developing and implementing data pipelines to support data-driven decision-making for clients. This role is integral to simplifying and streamlining access to high-value datasets through Liberator, an innovative data fabric API that empowers organizations to harness both historical and real-time streaming data effectively. The successful candidate will be responsible for onboarding complex datasets and delivering production-ready solutions, ultimately enabling clients to gain a competitive advantage in the financial services industry.
KEY RESPONSIBILITIES:
- Design, develop, and maintain performant data pipelines (ETL/ELT) to integrate data from diverse sources such as databases, APIs, and files into cloud data warehouse and storage platforms (e.g., Snowflake, Wasabi).
- Collaborate closely with cross-functional teams to analyze requirements and to design and implement data solutions that meet clients' business needs.
- Write clean, efficient, and maintainable code in Python, following industry best practices and coding standards.
- Debug and troubleshoot data pipelines to ensure data quality, performance, and reliability.
- Implement data validation, cleansing, and standardization processes to uphold data integrity and quality.
- Design and implement data models aligned with business requirements, supporting efficient data analysis and reporting.
- Utilize scheduling technologies (e.g., Airflow, Prefect) to automate data workflows.
- Participate in code reviews, sharing feedback to drive continuous improvement and collaboration across the team.
- Maintain awareness of industry trends and advancements in data engineering and related technologies to bring innovative solutions to the team.
QUALIFICATIONS, SKILLS AND EXPERIENCE:
- At least a Bachelor’s degree in Computer Science, Engineering, or a related field (or equivalent experience).
- 5+ years of experience as a Data Engineer, working on complex data processing and analytics tasks.
- Proficiency in Python and Bash scripting.
- Strong understanding of Linux-based systems and containerization technologies (e.g., Docker).
- Extensive experience in database design, development, and SQL (PostgreSQL time-series knowledge is a plus).
- Proficiency in building and managing ETL/ELT pipelines for various data sources.
- Experience with cloud data warehouse and storage solutions such as Snowflake and Wasabi.
- Familiarity with data modeling and quality processes.
- Knowledge of scheduling tools (e.g., Airflow, Prefect) and container orchestration (Kubernetes experience is a plus).
- Experience with Machine Learning and Artificial Intelligence is advantageous.
- Strong problem-solving and analytical skills.
- Excellent interpersonal and communication skills to facilitate effective collaboration.
- Ability to work independently, prioritize tasks, and manage time effectively.