Talk the Talk: Data Science Jargon for the Non-Data Scientist

Every executive I’ve worked with has sat in a meeting where someone used terms like “model,” “feature engineering,” or “AI-powered” and nodded politely while filing a quiet note to look it up later.

This is a vocabulary problem, and vocabulary problems have vocabulary solutions.

The terms you actually hear

Artificial Intelligence is the capacity for machines to generate inferences from input without explicit human instruction. The definition is famously slippery: “AI is whatever hasn’t been done yet” is a quip usually credited to Larry Tesler, and it’s not entirely wrong. In practice, when someone says AI in a business meeting, they usually mean machine learning.

Machine Learning is a set of algorithms and statistical models that enable computers to identify patterns in data and make predictions without being explicitly programmed for each decision. Classical statistics (logistic regression, linear models) lives here alongside newer approaches (random forests, neural networks). The common thread: the system learns from examples rather than rules.
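To make "learns from examples rather than rules" concrete, here is a minimal sketch in plain Python. The data, the churn scenario, and the function names are all invented for illustration; real machine learning libraries do something far more sophisticated, but the contrast is the same: one cutoff is typed in by a person, the other is chosen by looking at examples.

```python
# Hypothetical toy data: (monthly_logins, churned) pairs.
examples = [(1, True), (2, True), (3, True), (8, False), (10, False), (12, False)]


def rule_based(logins):
    # A rule written by a person: the cutoff of 5 is a guess.
    return logins < 5


def learn_threshold(data):
    # "Training": try each observed value as a cutoff and keep the one
    # that misclassifies the fewest examples.
    candidates = sorted({x for x, _ in data})

    def errors(cutoff):
        return sum((x < cutoff) != churned for x, churned in data)

    return min(candidates, key=errors)


cutoff = learn_threshold(examples)


def learned(logins):
    # Same shape as the rule above, but the cutoff came from the data.
    return logins < cutoff
```

If the examples change, the learned cutoff changes with them and nobody has to rewrite the rule. That is the whole idea, scaled down to one number.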

Model is the most overloaded word in data science. Depending on context, it can refer to the algorithm, the fitted relationship, the deployed system, or sometimes the predictions it produces. When someone says “the model says,” they usually mean a trained system is producing an output based on input data.

Feature is an input variable to a model: a predictor, typically a column of data. Feature engineering, the process of creating and transforming these inputs, is often what separates a good model from a great one. The outcome you’re trying to predict is called the label or target.
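A small sketch may help separate the three ideas: raw data, engineered features, and the label. Everything below is hypothetical (the customer record, the field names, the choice of features); it only shows the kind of transformation people mean when they say "feature engineering."

```python
from datetime import date

# Hypothetical raw customer record; every field name here is made up.
raw = {
    "signup_date": date(2023, 1, 15),
    "last_login": date(2024, 3, 1),
    "total_spend": 480.0,
    "order_count": 6,
    "churned": False,  # the label (target) we want to predict
}


def engineer_features(record, as_of):
    """Turn raw columns into model inputs (features)."""
    orders = record["order_count"]
    return {
        "tenure_days": (as_of - record["signup_date"]).days,
        "days_since_login": (as_of - record["last_login"]).days,
        # Guard against division by zero for customers with no orders.
        "avg_order_value": record["total_spend"] / orders if orders else 0.0,
    }


features = engineer_features(raw, as_of=date(2024, 3, 31))
label = raw["churned"]
```

The raw dates are not very useful to a model on their own. "Days since last login" is the same information reshaped into something a model can more easily relate to the label.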

Data Quality is the degree to which your data is accurate, consistent, complete, and fit for the purpose you’re using it for. Its importance is consistently underestimated. Most AI initiatives that fail do so because of data quality problems, not model problems.
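Data quality checks are usually simple counts, which is part of why they get skipped. The sketch below runs four such checks on a handful of invented rows. The field names, the 0 to 120 age range, and the upper-case country convention are assumptions chosen for the example, not a standard.

```python
# Hypothetical rows; None marks a missing value.
rows = [
    {"customer_id": 1, "age": 34, "country": "US"},
    {"customer_id": 2, "age": None, "country": "us"},
    {"customer_id": 2, "age": 41, "country": "DE"},
    {"customer_id": 3, "age": 212, "country": "FR"},
]


def quality_report(rows):
    ids = [r["customer_id"] for r in rows]
    return {
        # Completeness: how many rows are missing an age?
        "missing_age": sum(r["age"] is None for r in rows),
        # Accuracy: ages outside a plausible range (0 to 120 is an assumed bound).
        "implausible_age": sum(
            r["age"] is not None and not (0 <= r["age"] <= 120) for r in rows
        ),
        # Consistency: country codes that are not written in upper case.
        "inconsistent_country": sum(
            r["country"] != r["country"].upper() for r in rows
        ),
        # Uniqueness: customer IDs that appear more than once.
        "duplicate_ids": len(ids) - len(set(ids)),
    }


report = quality_report(rows)
```

Four rows, four problems. None of them would stop a model from training, and that is exactly why they are dangerous.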

What to listen for in meetings

When someone proposes an AI initiative, these are the questions that matter:

What specific decision will the model make or inform?
What data will train it, and how was that data collected?
What does good performance look like — and how will you measure it?
Who maintains it after it’s deployed?

These questions don’t require technical expertise. They require knowing what to ask.
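For readers who want to see why the measurement question deserves a real answer, here is a minimal sketch with invented numbers. It compares a hypothetical churn model against a "baseline" that always predicts no churn, using two common measures: accuracy (share of predictions that were right) and recall (share of actual churners that were caught).

```python
# Hypothetical outcomes for 10 customers: True means the customer churned.
actual = [False] * 8 + [True] * 2
predicted = [False, False, False, False, False, False, False, True, True, False]

# Baseline: always predict "no churn", with no model at all.
baseline = [False] * len(actual)


def accuracy(actual, predicted):
    # Share of predictions that match what really happened.
    return sum(a == p for a, p in zip(actual, predicted)) / len(actual)


def recall(actual, predicted):
    # Of the customers who really churned, what share did we catch?
    caught = sum(a and p for a, p in zip(actual, predicted))
    return caught / sum(actual)


model_scores = (accuracy(actual, predicted), recall(actual, predicted))
baseline_scores = (accuracy(actual, baseline), recall(actual, baseline))
```

In this toy data both approaches score the same accuracy, yet the baseline never catches a single churner. "It's 80% accurate" means very little until you know what it is being compared to and which measure matters for the decision.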

This post is adapted from a series originally published at Data Science at Work.