This company tries to solve community moderation with a “data engine for AI”


Dataloop VP of Product Shlomi Avigdor explains how data engines play a part in scaling and deploying AI to solve the content moderation problem in online social experiences.

Keeping discussions safe in online communities is a challenge, and that is putting it mildly even for traditional text-based social media sites. On platforms that use live voice, like online games, the task can become nearly impossible, at least for humans. Dataloop offers a solution that could help companies use AI to monitor and manage their online environments.

Data engines and AI models

Dataloop isn’t a complete AI product in itself. Rather, the company describes its product as a “data engine for AI”. It provides companies with tools that make training AI models possible, including storage and annotation resources and automation pipelines that make integrating AI into an existing process easier.

“Companies are accumulating data for their industries, and they have trouble storing them while monitoring their context,” Dataloop VP of Product Shlomi Avigdor told THE DECODER. “In between arranging and searching their data through AI development, that’s where Dataloop comes in.”


According to Avigdor, AI alone can’t solve the moderation problem anyway because most words or phrases aren’t problematic in themselves. They’re problematic in context. Dataloop’s tools can work with an AI that is learning to recognize problematic words or phrases to provide an annotated snippet to a human moderator, who can then make the final decision.

“The content that you actually need is a problem in itself,” said Avigdor. “If you have 50,000 audio files in one minute, can you really go through all of that? …Dataloop has that in its platform.”

Where humans come into the loop

Moderation with AI alone might already be possible, but it isn’t good yet. Social media companies that rely on AI for initial content moderation but offer a human appeal process have already shown this. A lot of content that should be flagged isn’t, and a lot of content that shouldn’t be flagged is.

Using a data engine that is fed by and feeds into the AI, while keeping a human in the middle, not only prevents these issues but also helps to improve the AI. Operators can tell the AI model when its contributions were helpful and when they weren’t.

“What happens in some cases is the public model doesn’t do so well at the beginning, but it gets better and better over time, and you can get to 90 percent accuracy very quickly,” said Avigdor. “Refining that to 99 percent, that’s the difficult part. That’s where the humans come into the loop.”


