
Newsletter #082 – ML in production

by info.odysseyx@gmail.com


After a few-week hiatus, the newsletter is back! I’ve been pretty busy moving across the country (so excited to be a San Diego resident 😀 ) and taking care of my now 6-month-old firstborn (can’t believe it’s 6 months already). But I’m excited to publish issue #082.

I am currently thinking about how I want to develop MLinProduction further. Many new subscribers have asked me if there is an archive of previous issues of the newsletter. So far I have not published the newsletter on the blog. But I am happy to report that new issues will be published on the blog and also sent to subscribers. I would like to publish previous issues on the blog, but that all depends on how much time I have in the near future.

Anyway, I’ll keep sharing new developments and decisions as I make them. If you have any ideas for how I can improve the newsletter, I’d love to hear them! Send me an email or leave a comment on the blog!


Here’s what I’ve been reading/watching/listening to lately:

  • Using GitHub Actions for MLOps and Data Science – The first post in a multi-part blog series on using GitHub Actions, GitHub’s native event-driven automation system, to execute ML pipeline tasks. When the author comments on a pull request, an action is triggered that performs model training and evaluation. When evaluation is complete, another action comments on the PR with the evaluation metrics, so the data scientist can decide whether to merge the changes or not. I like how this flow generates an auditable history of code changes and the impact on model metrics.

  • Keep your data pipelines healthy with the Great Expectations GitHub Action – The GitHub Actions team has partnered with Great Expectations to create a workflow that can automatically test, document, and profile your data pipelines. For those unfamiliar with the Python library, Great Expectations allows you to specify “unit tests” for datasets via expectations. An initial set of expectations can be generated with the built-in profiling tool, and dataset documentation is generated directly from expectations. I’ve been playing around with the library a bit and look forward to summarizing my findings in an upcoming blog post.

  • Emerging Architectures for Modern Data Infrastructure – Data infrastructure serves two high-level purposes: to help business leaders make better decisions through the use of data (analytics use cases) and to build data intelligence into customer-facing applications, including through machine learning (operational use cases). This a16z blog post provides an overview of 3 common blueprints used for 1) modern business intelligence, 2) multimodal data processing, and 3) AI/ML that rely on modern cloud data infrastructure, both vendor and open source. According to the authors, these blueprints were synthesized from conversations with “hundreds of founders, enterprise data leaders, and other experts – including interviewing more than 20 professionals about their current data stacks”.

  • AWS Data Wrangler – An AWS Professional Service open-source Python initiative that extends pandas to AWS by connecting DataFrames and AWS data services like Redshift, Glue, Athena, EMR, and more. Check out the super simple tutorials to read/write DataFrames directly to S3 as flat files or Parquet files. The library also exposes methods to easily crawl your files and generate meta tables.

  • How to deploy machine learning models into production – This post from the Stack Overflow blog outlines three key areas to consider when deploying models to production: data storage and querying, frameworks and tooling, and feedback and iteration. I would take the second half of the post with a grain of salt (the article seems too salesy to me with its emphasis on GCP and TensorFlow Extended), but I agree with the author’s points about thinking through what a deployed system should look like at the start of a project. Thanks to (several) subscribers for putting this article on my radar!
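
For anyone curious what “unit tests” for datasets might look like in practice, here is a minimal plain-Python sketch of the idea behind the Great Expectations item above. To be clear, this does not use the Great Expectations library or its API: the helper names (expect_not_null, expect_between, validate) and the sample rows are all made up for illustration.

```python
from typing import Callable, Dict, List, Optional

# A row is a mapping from column name to a numeric value (or None if missing).
Row = Dict[str, Optional[float]]
# A check takes the whole dataset and reports whether it passes.
Check = Callable[[List[Row]], bool]


def expect_not_null(column: str) -> Check:
    """Hypothetical helper: passes when no row is missing a value in `column`."""
    def check(rows: List[Row]) -> bool:
        return all(row.get(column) is not None for row in rows)
    return check


def expect_between(column: str, low: float, high: float) -> Check:
    """Hypothetical helper: passes when every non-missing value is in [low, high]."""
    def check(rows: List[Row]) -> bool:
        return all(
            low <= row[column] <= high
            for row in rows
            if row.get(column) is not None
        )
    return check


def validate(rows: List[Row], checks: Dict[str, Check]) -> Dict[str, bool]:
    """Run every named check against the dataset and collect pass/fail results."""
    return {name: check(rows) for name, check in checks.items()}


# Made-up sample data; the second row is missing its income value.
rows = [
    {"age": 34, "income": 52000.0},
    {"age": 29, "income": None},
    {"age": 41, "income": 87000.0},
]

checks = {
    "age is never missing": expect_not_null("age"),
    "age is between 0 and 120": expect_between("age", 0, 120),
    "income is never missing": expect_not_null("income"),
}

report = validate(rows, checks)
for name, passed in report.items():
    print("PASS" if passed else "FAIL", "-", name)
```

The nice part of writing checks this way is that the named checks double as a readable description of what the dataset is supposed to look like, which is roughly the appeal of generating documentation from expectations.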

That’s it for this week. If you have any ideas, I’d love to hear them in the comments below!
