May Webinar Recap: An Introduction to DataChat

We recently hosted a webinar to show how DataChat’s Conversational Intelligence is transforming the way organizations leverage AI and BI to make data-driven decisions. In this 30-minute video, Jignesh Patel, co-founder and CEO, and Danny Thompson, Executive Vice President of Sales, gave an introduction to our all-in-one, state-of-the-art analytics platform, discussed the benefits of using our Guided English Language to create no-code data science pipelines, and shared success stories from our customers.

Some common questions from the webinar included:

Is DataChat a SaaS platform? What cloud vendors do you support?

Yes, DataChat is an analytics SaaS platform and can run on any major cloud provider, including Amazon Web Services (AWS), Google Cloud Platform (GCP), and Azure. We also allow customers to deploy in their private cloud to address any data privacy concerns.

What data sources can DataChat support (files, database, etc)?

DataChat supports a number of data sources, including common flat files (such as CSV, TSV, and Excel files) and many major database systems, including Snowflake, BigQuery, PostgreSQL, MySQL, SQL Server, and more.

Can a college student get access to DataChat?

Send us an email. We are spinning up a new program to provide FREE access to the DataChat platform for a limited time, specifically for college and university students. We are also launching DataChat University to help you learn the platform at your own pace. Lastly, we provide rich documentation on our website that is available to our customers.

We’re always excited to share stories about how DataChat can power analytics for our customers. If you have a specific use case you would like to see demonstrated, please let us know. You can also follow us on LinkedIn and Twitter for more updates and announcements.

Introduction to DataChat: A Q&A with Co-Founder Rogers Jeffrey Leo John

We asked DataChat’s co-founder, Rogers Jeffrey Leo John, to tell us about the origins of the company. What was missing from the marketplace? How has DataChat pushed the envelope?

Q: How did DataChat start?

The idea formed after Jignesh [Patel, our CEO and co-founder] was a visiting scientist at Pivotal Labs. He observed their data teams and the problems they were trying to solve. He noticed that most of the problems (and their solutions) followed similar patterns: when training a model, they ended up in a loop of loading the Python package, selecting features, training the model, then analyzing the results. This was followed by tweaking the features and retraining the model over again.

With that observation in mind, we wrote our first paper. In that paper, we suggested an early prototype of Ava, our Conversational Intelligence assistant, that could abstract the model training loop into a Python template.
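The loop described above (select features, train, analyze, tweak, retrain) can be sketched as a small reusable template. This is a minimal illustration only, not the template from the paper or anything Ava generates: the toy data, the nearest-centroid classifier, and every name below (`training_loop`, `train_centroids`, `TRAIN`, `TEST`, `CANDIDATES`) are assumptions made up for the example.

```python
# Toy data, invented for illustration: "height" separates the classes, "noise" does not.
TRAIN = [
    {"height": 1.0, "noise": 9.0, "label": "a"},
    {"height": 1.2, "noise": 3.0, "label": "a"},
    {"height": 3.0, "noise": 8.0, "label": "b"},
    {"height": 3.2, "noise": 2.0, "label": "b"},
]
TEST = [
    {"height": 1.1, "noise": 2.0, "label": "a"},
    {"height": 3.1, "noise": 9.0, "label": "b"},
]
# Each entry is one "tweak" of the feature set to try.
CANDIDATES = [["noise"], ["height", "noise"], ["height"]]


def select(rows, features):
    """Step 1: pick the chosen feature columns out of each row."""
    return [[row[name] for name in features] for row in rows]


def train_centroids(X, y):
    """Step 2: train a nearest-centroid model (one mean vector per class)."""
    grouped = {}
    for x, label in zip(X, y):
        grouped.setdefault(label, []).append(x)
    return {
        label: [sum(column) / len(column) for column in zip(*vectors)]
        for label, vectors in grouped.items()
    }


def predict(model, x):
    """Return the class whose centroid is closest to x (squared distance)."""
    return min(
        model,
        key=lambda label: sum((a - b) ** 2 for a, b in zip(model[label], x)),
    )


def accuracy(model, X, y):
    """Step 3: analyze the results on held-out rows."""
    hits = sum(predict(model, x) == label for x, label in zip(X, y))
    return hits / len(y)


def training_loop(train_rows, test_rows, candidate_feature_sets, label="label"):
    """Steps 1-3 repeated for every candidate feature set; returns the best one."""
    y_train = [row[label] for row in train_rows]
    y_test = [row[label] for row in test_rows]
    results = []
    for features in candidate_feature_sets:
        model = train_centroids(select(train_rows, features), y_train)
        score = accuracy(model, select(test_rows, features), y_test)
        results.append((features, score))
    return max(results, key=lambda item: item[1])


best_features, best_score = training_loop(TRAIN, TEST, CANDIDATES)
print(best_features, best_score)
```

Once the loop is a single function, the only things a user has to supply are the data and the feature sets to try, which is the kind of repetition a template can absorb.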

Q: How did you improve the model training process?

We realized that, by leveraging controlled natural language (CNL), we could abstract away the programming languages (Python, R, SQL, etc.) from the user in favor of a subset of English. That was the genesis of DataChat’s Guided English Language© (GEL), which was inspired by the “language” used by aviators, such as the NATO phonetic alphabet. GEL allows the user to build data science workflows without needing to know Python, R, SQL, or any other traditional data science tool.
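To make the idea of a controlled natural language concrete, here is a minimal sketch of a restricted set of English sentences mapped onto data operations. The three sentence patterns below are hypothetical and are not GEL syntax; the names `PATTERNS`, `run_sentence`, and `run_pipeline` are likewise invented for this example.

```python
import re

# Hypothetical sentence patterns. These are NOT GEL; they only illustrate the idea
# of accepting a small, unambiguous subset of English.
PATTERNS = [
    (re.compile(r"^keep rows where (\w+) is greater than (\d+(?:\.\d+)?)$"), "filter_gt"),
    (re.compile(r"^sort by (\w+)$"), "sort"),
    (re.compile(r"^keep the first (\d+) rows$"), "head"),
]


def run_sentence(sentence, rows):
    """Apply one controlled-language sentence to a list of row dicts."""
    # Sentences are lowercased before matching, so column names must be lowercase.
    text = sentence.strip().lower()
    for pattern, operation in PATTERNS:
        match = pattern.match(text)
        if match is None:
            continue
        if operation == "filter_gt":
            column, threshold = match.group(1), float(match.group(2))
            return [row for row in rows if row[column] > threshold]
        if operation == "sort":
            column = match.group(1)
            return sorted(rows, key=lambda row: row[column])
        if operation == "head":
            return rows[: int(match.group(1))]
    # Anything outside the controlled subset is rejected rather than guessed at.
    raise ValueError(f"sentence is not in the controlled language: {sentence!r}")


def run_pipeline(sentences, rows):
    """Run the sentences in order; the sentence list doubles as a readable log."""
    for sentence in sentences:
        rows = run_sentence(sentence, rows)
    return rows


people = [
    {"name": "a", "age": 25},
    {"name": "b", "age": 41},
    {"name": "c", "age": 33},
]
steps = [
    "Keep rows where age is greater than 30",
    "Sort by age",
    "Keep the first 1 rows",
]
print(run_pipeline(steps, people))
```

Because the language is restricted, every accepted sentence has exactly one meaning, and the user never sees the Python underneath.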

We spun our research out into DataChat and have been growing and evolving ever since.

Q: How has Ava evolved?

While developing Ava and GEL to make model training more intuitive, we’ve also expanded GEL to cover a wide array of data science tools and functions, including data ingestion, data wrangling, and visualization, along with machine learning and explainable artificial intelligence. This makes us a truly all-in-one platform that allows more business users to work with their own data to answer their own questions without needing to learn how to code or work with more complicated data science tools.

Q: What are some innovations in DataChat’s design?

One problem we’re solving is the reproducibility gap. A few years ago, the industry didn’t care about reproducibility; they were more concerned with model accuracy and less concerned about how they got there. We baked reproducibility into DataChat from the beginning.

By having conversations with Ava in GEL, we’re actually automating the documentation and commenting pieces of the data science process, too. Our workflows are built in English, which makes it very easy to look back and see exactly what happened and when. This not only makes it easy to understand the logic behind the pipeline but also improves governance and transparency across an organization.

Q: As the platform matured, what other problems have you focused on?

Most of the gains in DataChat can be attributed to how we’ve included industry best practices directly in the platform to avoid common pitfalls, such as including your label in your feature set. This helps build confidence for novice users and lessens the mental load for more experienced users.
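The label pitfall mentioned above is easy to guard against in code. The following is a minimal sketch of one such check, not a description of how DataChat implements it; the function name `split_features_and_label` and the sample columns are assumptions for illustration.

```python
def split_features_and_label(rows, label, features=None):
    """Return (X, y), refusing any feature list that contains the label column."""
    if features is None:
        # Default: use every column except the label.
        features = [name for name in rows[0] if name != label]
    if label in features:
        # A model that can see the label as an input learns nothing useful.
        raise ValueError(f"label column {label!r} must not be used as a feature")
    X = [[row[name] for name in features] for row in rows]
    y = [row[label] for row in rows]
    return X, y


# Sample rows invented for the example.
customers = [
    {"age": 30, "income": 50, "churned": 1},
    {"age": 45, "income": 80, "churned": 0},
]

X, y = split_features_and_label(customers, "churned")
print(X, y)

try:
    split_features_and_label(customers, "churned", ["age", "churned"])
except ValueError as error:
    print("rejected:", error)
```

Failing loudly, instead of silently dropping the column, keeps the mistake visible to the person building the model.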

We’ve also seen performance improvements when it comes to data wrangling. For example, in one of our test cases, 100 percent of the DataChat users were able to at least attempt every question in a set of data wrangling problems, compared to 73 percent of Python users.

Q: Who will use DataChat?

The biggest challenge we’re trying to solve is how we can democratize data science and bring those data science tools to more users. Everybody has data, but not everybody can pay a data scientist to work with it or has the time to learn the tools themselves.

Q: How can DataChat improve the data science field?

Obviously, making data science more accessible is a huge win for everybody. Business users feel empowered to answer their own questions, which gives time back to their data teams. Rather than chasing tickets, those teams can spend more time digging deeper into their data to find more novel insights for their organization.

When we were first developing the platform, we ran some user studies with users whose data science knowledge ranged from nothing at all to intermediate Python experience. We found that DataChat cut those users’ time to the first model roughly tenfold (~2 minutes in DataChat vs. ~20 minutes in Python).

We also saw that DataChat users were able to train more models (an average of six, compared to an average of two in Python), and 100 percent of the DataChat users were able to train at least one model (compared to 80 percent of Python users). Models trained in DataChat were also more consistently accurate (with F1 scores in the 0.7 to 0.8 range, compared to 0.3 to 0.8 in Python).
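For readers unfamiliar with the F1 score cited above, it is the harmonic mean of precision and recall. The sketch below computes it for a binary problem. The study does not say which F1 variant it used (binary, macro, or weighted), so the binary form and the sample labels here are assumptions.

```python
def f1_score(y_true, y_pred, positive=1):
    """Binary F1: the harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    true_pos = sum(t == positive and p == positive for t, p in pairs)
    false_pos = sum(t != positive and p == positive for t, p in pairs)
    false_neg = sum(t == positive and p != positive for t, p in pairs)
    if true_pos == 0:
        # No correct positive predictions: precision and recall are both zero.
        return 0.0
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return 2 * precision * recall / (precision + recall)


# Invented labels: 2 true positives, 1 false positive, 1 false negative.
actual = [1, 1, 1, 0, 0]
predicted = [1, 1, 0, 1, 0]
print(round(f1_score(actual, predicted), 3))
```

A score near 1.0 means the model finds most of the positive cases without raising many false alarms, while a score near 0.3 means it does poorly on at least one of those two counts.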

Q: Has the platform met your initial expectations since spinning out from UW-Madison?

Overall, we think we’ve already pushed the envelope considerably and are continuing to do so as we add more functionality and features to the platform. We’ve also made an impact for our customers from day one. They’re more productive, they’re finding insights they couldn’t before, and they’re unlocking tools that were out of their reach before.