SEDE Dataset | Papers With Code

SEDE is a dataset of 12,023 complex and diverse SQL queries paired with natural-language titles and descriptions, written by real users of the Stack Exchange Data Explorer in the course of natural interaction. These pairs contain a variety of real-world challenges that have rarely been reflected in other semantic parsing datasets. The goal of the dataset is to take a significant step towards evaluating Text-to-SQL models in a real-world setting. SEDE contains at least 10 times more SQL query templates (queries after canonicalization and anonymization of values) than other Text-to-SQL datasets, and has the most diverse set of utterances and SQL queries (in terms of 3-grams) of all single-domain datasets. It introduces real-world challenges such as under-specification, use of parameters in queries, date manipulation, and more.
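To make the two measures in the description concrete, here is a minimal Python sketch of what "query templates" and "diversity in terms of 3-grams" could look like. It is an illustration under stated assumptions, not the procedure used by the SEDE authors: the function names `to_template` and `distinct_trigrams` are hypothetical, the masking rules are simple regular expressions, and tokenization is plain whitespace splitting.

```python
import re


def to_template(sql: str) -> str:
    """Rough canonicalization (an assumption, not the authors' method):
    lowercase the query, mask literal values, and collapse whitespace."""
    s = sql.strip().lower()
    # Replace single-quoted string literals (with '' as an escaped quote).
    s = re.sub(r"'(?:[^']|'')*'", "<str>", s)
    # Replace standalone numeric literals; digits inside identifiers are kept.
    s = re.sub(r"\b\d+(?:\.\d+)?\b", "<num>", s)
    # Collapse runs of whitespace so formatting differences do not matter.
    return re.sub(r"\s+", " ", s)


def distinct_trigrams(texts):
    """Collect the set of distinct whitespace-token 3-grams across texts."""
    grams = set()
    for text in texts:
        tokens = text.split()
        grams.update(zip(tokens, tokens[1:], tokens[2:]))
    return grams


# Two queries that differ only in literal values and spacing.
q1 = "SELECT TOP 10 Id FROM Posts WHERE Score > 5 AND Title = 'foo'"
q2 = "select top 20 Id  from Posts where Score > 100 and Title = 'bar baz'"

templates = {to_template(q) for q in (q1, q2)}
trigrams = distinct_trigrams(["a b c d", "a b c"])
```

In this sketch both queries map to the same template, so they count once when templates are tallied, while a larger set of distinct 3-grams indicates more varied wording or query structure.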

1 mention: @omarsar0
Keywords: dataset
Date: 2021/06/10 12:18

Referring Tweets

@omarsar0 I would be the happiest machine learning engineer if an ML system can help generate accurate SQL queries based purely on text descriptions. We are not there yet! This paper proposes a new dataset to help improve and evaluate Text-to-SQL models. t.co/6WtDjZAmz2 t.co/ZJZlaqzgnc

