Subject archive for "code," page 8

Code

SHAP and LIME Python Libraries: Part 1 - Great Explainers, with Pros and Cons to Both

This blog post provides a brief technical introduction to the SHAP and LIME Python libraries, followed by code and output to highlight a few pros and cons of each. If interested in a visual walk-through of this post, consider attending the webinar.

By Josh Poduska | 6 min read

Data Science

Making PySpark Work with spaCy: Overcoming Serialization Errors

In this guest post, Holden Karau, Apache Spark Committer, provides insights on how to use spaCy to process text data. Karau is a Developer Advocate at Google, as well as a co-author of "High Performance Spark" and "Learning Spark". She has a repository of her talks, code reviews and code sessions on Twitch and YouTube. She is also working on Distributed Computing 4 Kids.

By Domino | 8 min read

Data Science

Item Response Theory in R for Survey Analysis

In this guest blog post, Derrick Higgins covers item response theory (IRT) and how data scientists can apply it within a project. As a complement to the guest blog post, there is also a demo within Domino.

By Derrick Higgins | 9 min read

Benchmark

Benchmarking NVIDIA CUDA 9 and Amazon EC2 P3 Instances Using Fashion MNIST

In this post, Josh Poduska, Chief Data Scientist at Domino Data Lab, writes about benchmarking NVIDIA CUDA 9 and Amazon EC2 P3 instances using Fashion MNIST. For additional insight from Poduska, he will also be presenting "Managing Data Science in the Enterprise" at Strata New York 2018.

By Josh Poduska | 8 min read

Data Science

Three Simple Worrying Stats Problems

In this guest post, Sean Owen writes about three data situations that produce ambiguous results and how causation helps clarify the interpretation of data. A version of this post previously appeared on Quora. Domino would like to extend special thanks to Sean for updating the Quora post for our blog.

By Domino | 13 min read

Data Science

On the Importance of Community-Led Open Source

Wes McKinney, Director of Ursa Labs and creator of the pandas project, presented the keynote "Advancing Data Science Through Open Source" at Rev. McKinney's keynote covered open source's symbiotic relationship with data science and the importance of community-led open source. This blog post includes distilled highlights, the full video, and a transcript of the keynote.

By Domino | 33 min read
