About the Company
Pachyderm is an enterprise-grade, open source data science platform that makes explainable, repeatable, and scalable ML/AI a reality. Our platform brings together version control for data with the tools to build scalable end-to-end ML/AI pipelines while empowering users to use any language, framework, or tool they want. What makes Pachyderm a natural choice for data science teams is that they can iterate quickly and know that everything is tracked and 100% reproducible.
What would data analytics infrastructure (namely Hadoop) look like if we rebuilt it from scratch today? We think it would be containerized, modular, and easy enough for a single person to use while still being scalable enough for a whole company. Tools like Docker and Kubernetes provide the perfect building blocks for us to revolutionize data infrastructure!
Pachyderm is “Git for Data Science.” We offer complete version control for data and give your data science team the same first-class development tools as software developers. Pachyderm is ideal for building machine learning pipelines and ETL workflows because we trace every model and output directly back to the raw input datasets that created it (also known as provenance).
Since everything in Pachyderm is a container, data scientists can use any languages or libraries they want (e.g., Spark, R, Python, or OpenCV) without any additional infrastructure overhead.
At Pachyderm, we're building an open source, enterprise-grade data science platform that lets you deploy and manage multi-stage, language-agnostic data pipelines while maintaining complete reproducibility and provenance. Our system, developed with open source roots, shifts the paradigm of data science workflows by providing reproducibility, data provenance, and the opportunity for true collaboration. Pachyderm uses modern technologies like Docker and Kubernetes to build an entirely new method of analyzing data. Offered both as an in-house solution and as a hosted service, Pachyderm brings together version control for data with the tools to build scalable end-to-end ML/AI pipelines while empowering users to use any language, framework, or tool they want. If you want to learn more about our grand vision, read what has become our “manifesto.”
Pachyderm is a rapidly growing, early-stage company funded by top VCs: Benchmark, Decibel, M12, and Y Combinator.
Like many modern companies, Pachyderm embraces a “remote-first” approach to growing our team. It gives us a huge advantage in hiring top, diverse talent across the country while giving our team members the flexibility to work from anywhere.