Job Description
Job Title: Delta Lake & Unity Catalog Developer
Location: New York City Metropolitan Area (On-site)
Reports To: Chief Executive Officer
Company Overview:
Scalata is a pioneering AI-driven finance automation platform headquartered in the vibrant New York City Metropolitan Area. Our vision is to democratize investment data and empower sophisticated credit decision-making through the transformative power of generative AI. We specialize in automating the entire credit lifecycle – from complex data ingestion and analysis to final decisioning – specifically for the demanding needs of the credit industry. At Scalata, we are committed to responsible AI development, personalization at scale, and granting businesses greater autonomy in how they leverage financial data. We are transforming data handling and decision-making processes within finance.
Core Skills & Technologies
· Delta Lake (Open Source): Building and optimizing transactionally consistent, versioned tables on data lakes for scalable ETL and analytics; experience with Apache Spark and with batch and streaming data pipelines; implementing time travel and schema evolution.
· Unity Catalog (Open Governance Model): Configuring access control, auditing, lineage, quality monitoring, and data discovery using cataloging tools (open-source Unity Catalog, Hive Metastore as used by Presto/Trino, or Iceberg catalogs); applying standard SQL syntax for privilege management.
· Data Governance: Designing hierarchical privileges for catalogs, schemas, and tables; using ANSI SQL commands to grant/revoke access; maintaining audit trails and documenting data asset lineage.
· Cloud & Storage Platforms: Integrating Delta Lake and cataloging with AWS S3, Azure Data Lake, Google Cloud Storage, or on-prem Hadoop/Spark environments.
· Programming & Orchestration: Building data pipelines with Python, Scala, Rust, or Go; experience with tools such as Airflow, dbt, and CI/CD for ETL orchestration and testing.
Typical Responsibilities
· Architect, implement, and maintain transactionally consistent Delta tables for analytics and ML workloads.
· Establish and manage object-level data governance, permissions, and data lineage using open-source Unity Catalog or similar solutions (e.g., Iceberg catalogs).
· Design robust schemas and catalogs for organizational, project, and team-based data isolation.
· Automate batch and streaming ETL workflows; ensure schema compatibility and manage versioning.
· Integrate analytics platforms (Presto/Trino, Spark SQL) with open cataloging and indexing systems for unified data discovery.
· Advise on best practices for hierarchical privilege models and access auditing.
Example Skills List
· Apache Spark (PySpark/Scala), Delta Kernel
· Open-source Unity Catalog, Presto/Trino/Hive Metastore, Apache Iceberg
· SQL for data definition and permission management
· Data pipeline orchestration (Airflow, dbt)
· Cloud data lake integration (S3, ADLS, GCS)
· Data governance and lineage tracking
Professional Summary Example
Data Engineering professional with expertise in open-source Delta Lake and catalog technologies, specializing in building reliable, governed data lake architectures and enabling secure, discoverable, and compliant enterprise analytics in cloud or hybrid environments.
About Scalata:
Join Scalata and be part of a dynamic team that is reshaping the future of finance through artificial intelligence. We are driven by our mission to democratize investment data and empower better decision-making across the credit industry. Our culture fosters innovation, collaboration, and a commitment to responsible technology development. We value autonomy, personalization, and strive to create impactful solutions for our clients. If you are passionate about AI, finance, and driving growth in a cutting-edge environment, Scalata is the place for you.
Equal Opportunity Employer:
Scalata is an equal opportunity employer. We celebrate diversity and are committed to creating an inclusive environment for all employees. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, veteran status, or disability status.