C4 (Colossal Clean Crawled Corpus)

Large, cleaned web-crawl text dataset derived from Common Crawl, created for pre-training T5 and since used to train other language models.
