Discovered Pages

 » Reuters Corpora @ NIST http://trec.nist.gov/data/reuters/reuters.html
 » Cobuild Concordance and Collocations Sampler http://www.collins.co.uk/Corpus/CorpusSearch.aspx
 » [bnc] British National Corpus http://www.natcorp.ox.ac.uk/
 » Restore-Habeas.org | Restoring the Constitution Act of 2007 http://restore-habeas.org/whip/total.php
Discover From Related Topics
 aol  blog  business  communication  corpora  data  dataset  dialogue  dictionary  email  english  enron  language  linguistics  nlp  privacy  reference  research  search  spam  statistics  tools  translation  visualization  words

Discover From This Topic & Page:
 [] Reuters Corpora @ NIST http://trec.nist.gov/data/reuters/reuters.html
 [] Trampoline Enron Explorer http://enron.trampolinesystems.com/ (visualization enron corruption business)
 [] Merriam-Webster's Open Dictionary http://www3.merriam-webster.com/opendictionary/ (words reference english dictionary)
 [] Reuters Corpora @ NIST http://trec.nist.gov/data/reuters/reuters.html (corpus clustering naturallanguageprocessing dataset)
 [] AOL search data mirrors http://www.gregsadetsky.com/aol-data/ (search privacy google aol)
 [] AOL Search Logs http://data.aolsearchlogs.com/search/index.cgi (search privacy aol logs)
 [] Statistical NLP / corpus-based computational linguistics resources http://www-nlp.stanford.edu/links/statnlp.html (nlp statistics linguistics tools)
 [] Statistical NLP / corpus-based computational linguistics resources http://nlp.stanford.edu/links/statnlp.html (linguistics research corpora nlp)
 [] Humanities Text Initiative http://www.hti.umich.edu/ (literature books research reference)
 [] Similar Diversity - by Andreas Koller and Philipp Steinweber http://www.similardiversity.net/index.php (visualization design processing graphics)
 [] NLP Research Links, Vlado Keselj http://users.cs.dal.ca/~vlado/nlp/ (nlp research links linguistics)
 [] The Institute for Language, Speech and Hearing http://www.dcs.shef.ac.uk/research/ilash/Moby/ (english nlp reference language)
 [] Westbury Lab Web Site: Usenet Corpus Download http://www.psych.ualberta.ca/~westburylab/downloads/usenetcorpus.download.html (research corpus linguistics text)
 [] Wortschatz - International Portal http://corpora.informatik.uni-leipzig.de/download.html (corpora corpus english french)
 [] JustTheWord http://193.133.140.102/JustTheWord/ (corpus linguistics english tools)
 [] The Biggest Ever BitTorrent Leak: MediaDefender Internal Emails Go Public | TorrentFreak http://torrentfreak.com/mediadefender-emails-leaked-070915/ (copyright torrent email business)
 [] Web as Corpus ToolKit - Home Page http://www.drni.de/wac-tk/ (nlp corpus tools linguistics)
 [] Wordgumbo Main Index http://www.wordgumbo.com/index.htm (linguistics lists dictionary language)
 [] Developing Linguistic Corpora: a Guide to Good Practice http://www.ahds.ac.uk/creating/guides/linguistic-corpora/index.htm (corpus linguistics corpora nlp)
 [] Visuwords: online graphical dictionary http://www.visuwords.com/?word=genre (dictionary visualisation interactive writing)
 [] Email Datasets http://www.cs.cmu.edu/~einat/datasets.html (corpus nlp data enron)
 [] WWW BootCaT http://corpora.fi.muni.cz/bootcat/ (resources google linguistics api)
 [] David Lee's Bookmarks for Corpus-based Linguists http://devoted.to/corpora (linguistics corpus resources links)
 [] Sketch Engine http://www.sketchengine.co.uk/ (linguistics language words nlp)
 [] Blog Authorship Corpus http://www.cs.biu.ac.il/~koppel/BlogCorpus.htm (datamining corpus dataset blog)
 [] phishingcorpus [MyWiki] http://monkey.org/~jose/wiki/doku.php?id=PhishingCorpus (corpus dataset research text)
 [] GALE Overview http://projects.ldc.upenn.edu/gale/overview/ (project speech nlp corpus)
 [] American National Corpus http://americannationalcorpus.org/ (corpus source linguistics anc)
 [] ibot/jbot logs for 2006 http://purl.rikers.org/%23bzflag/ (corpus conversation dataset communication)
 [] Natural Programming http://www.cs.cmu.edu/~marmalade/reports.html (corpus dataset linux software)
 [] dialogue_acts_manual_1.0.pdf http://mmm.idiap.ch/private/ami/annotation/dialogue_acts_manual_1.0.pdf (dialogue corpus linguistics nlp)
 [] HCRC Map Task Corpus XML annotations http://www.hcrc.ed.ac.uk/maptask/ (dialogue research nlp corpus)
 [] UCL Survey of English Usage, UCL http://www.ucl.ac.uk/english-usage/archives/2006report.htm (linguistics research corpus communication)
 [] Corpus del Español [Davies/NEH/BYU] http://www.corpusdelespanol.org/ (corpus language linguistics spanish)
 [] Europarl Parallel Corpus http://www.statmt.org/europarl/ (translation corpus download machine)
 [] Overview of the Acquis Corpus http://wt.jrc.it/lt/Acquis/JRC-Acquis.2.2/doc/README_Acquis-Communautaire-corpus_JRC.html#Statistics (translation corpus europe)
 [] OPUS - an open source parallel corpus http://logos.uio.no/opus/ (corpus linguistics dataset nlp)
 [] AskOxford: Language Facts http://www.askoxford.com/oec/mainpage/oec02/?view=uk (statistics corpus words language)
 [] On Language - Erin McKean - New York Times http://www.nytimes.com/2007/07/29/magazine/29wwln-guest-t.html?_r=2&partner=rssnyt&emc=rss&a ... (english oed oec language)
 [] XCES http://www.xml-ces.org/ (linguistics xml corpus xces)
 [] Dialogue Diversity Corpus http://www-rcf.usc.edu/~billmann/diversity/DDivers-site.htm (dialogue linguistics corpus research)
 [] Bush Aides Helped Respond to Firings, E-Mails Show - washingtonpost.com http://www.washingtonpost.com/wp-dyn/content/article/2007/06/12/AR2007061202090.html?sub=new (email politics corpus pdf)
 [] Gonzales' Hold on Job Grows Uncertain http://www.sfgate.com/cgi-bin/article.cgi?f=/n/a/2007/03/19/national/w141027D85.DTL&hw=Justice&a ... (email politics corpus pdf)
 [] Splog Blog Dataset http://ebiquity.umbc.edu/resource/html/id/212/Splog-Blog-Dataset (corpus spam dataset blog)
 [] Erik Selberg » Blog Archive » Query logs and the AOL controversy http://erik.selberg.org/2006/08/13/query-logs-and-the-aol-controversy/ (search privacy corpus research)
 [] Complaint filed with FTC over AOL data exposure http://arstechnica.com/news.ars/post/20060815-7504.html (business legal privacy corpus)
 [] Major Embarassment for MediaDefender http://www.idm.net.au/story.asp?id=8793 (email business privacy corpus)
 [] AOL chief technology officer resigns: sources | Entertainment | Industry | Reuters.com http://today.reuters.com/news/articlenews.aspx?type=industryNews&storyID=2006-08-21T193427Z_01_W ... (search business privacy corpus)
 [] AG's corpus of news articles http://www.di.unipi.it/~gulli/AG_corpus_of_news_articles.html (corpus dataset nlp news)
 [] JRC-ACQUIS Multilingual Parallel Corpus V2.2 http://wt.jrc.it/lt/Acquis/ (corpus linguistics research machinetranslation)
 [] Technologies du Langage: Web: Google's missing pages: mystery solved? http://aixtal.blogspot.com/2005/02/web-googles-missing-pages-mystery.html (google search linguistics nlp)
 [] The battle of the spam News - PC Advisor http://www.pcadvisor.co.uk/news/index.cfm?newsid=6469 (enron email spam evaluation)
 [] AskOxford: Oxford English Corpus http://www.askoxford.com/oec/?view=uk (dictionary uk search translation)
 [] Roget's Thesaurus - Electronic Lexical Knowledge Base ELKB http://www.nzdl.org/ELKB/ (linguistics dictionary corpus dataset)
 [] BootCaT Toolkit http://sslmit.unibo.it/~baroni/bootcat.html (nlp corpus tools linguistics)
 [] Home Page for 20 Newsgroups Data Set http://people.csail.mit.edu/jrennie/20Newsgroups/ (corpus dataset clustering email)
 [] MMAX2 Annotation Tool http://www.eml-research.de/english/research/nlp/download/mmax.php (research nlp applications linguistics)
 [] VIB - The Voynich information browser - Start http://voynich.freie-literatur.de/ (linguistics programming code corpus)
 [] Time Magazine Corpus, 1923-2006 http://view.byu.edu/timemag/ (corpus propaganda media magazine)
 [] http://www.lllf.uam.es/~sandoval/UAMTreebank.html (nlp corpus linguistics corpora)
 [] To us, here and now, it appears thus. Is language huffman coded? http://vishnuvyas.wordpress.com/2007/05/31/is-language-huffman-coded/ (linguistics enron corpus research)
 [] OPUS - an open source parallel corpus http://omilia.uio.no/opus/ (parallel nlp corpus linguistics)
 [] Reuters-21578 Text Categorization Test Collection http://www.daviddlewis.com/resources/testcollections/reuters21578/ (data learning ai corpus)
 [] Cleaned W3C Subcollections (for TRECENT 2005) http://www.sis.pitt.edu/~daqing/w3c-cleaned.html (search email corpus research)
 [] CISD: TRAINS Dialogue Corpus http://www.cs.rochester.edu/research/cisd/resources/trains.html (dialogue linguistics research corpus)
 [] LDC Catalog http://www.ldc.upenn.edu/Catalog/ (corpus anc)
 [] AHDS Cross-Search Catalogue | CUVPlus http://www.ahds.ac.uk/catalogue/collection.htm?uri=lll-2469-1 (nlp corpus linguistics research)
 [] Europa - Eurostat - Regions - Main characteristics of the NUTS http://ec.europa.eu/comm/eurostat/ramon/nuts/lau_en.html (map gis corpus reference)
 [] EP604: Conversation & Discourse Analysis http://web.utk.edu/~tpaulus/EP604/604syll.htm (linguistics research conversation corpus)
 [] Westbury Lab Web Site http://www.psych.ualberta.ca/~westburylab/ (usenet corpus)
 [] Digital Systems Research Center: Note 1998-014 http://gatekeeper.dec.com/pub/DEC/SRC/technical-notes/abstracts/src-tn-1998-014.html (search nlp corpus dataset)
 [] LDC Catalog http://www.ldc.upenn.edu/Catalog/CatalogEntry.jsp?catalogId=LDC2005T35 (corpus)
 [] Death By Email: When is email a public record (in NH)? http://www.deathbyemail.com/2007/04/when_is_email_a.html (email legal corpus dataset)
 [] Map Task Publications http://www.hcrc.ed.ac.uk/maptask/maptask-papers.html (dialogue corpus research dataset)
 [] Text corpus - Wikipedia, the free encyclopedia http://en.wikipedia.org/wiki/Text_corpus (linguistics corpus)
 [] American National Corpus Second Release - Restricted End User License http://projects.ldc.upenn.edu/ANC/ANC_SecondRelease_EndUserLicense_Restricted.htm (corpus anc)
 [] Ted Pedersen - Enron Email Corpus by Topic http://www.d.umn.edu/~tpederse/enron.html (research email enron nlp)
 [] CEAS 2008 Spam Challenge - Email Donations http://ceas.klika.eu/ceas/ (email corpus research spam)
 [] Enron Corpus http://arg.vsb.cz/arg/Enron_Corpus/default.aspx (email enron nlp corpus)
 [] Latin vocabulary: High-frequency Latin Words http://www.slu.edu/colleges/AS/languages/classical/latin/tchmat/grammar/vocabulary/hif-ed2.html (latin corpus linguistics words)
 [] Dortmunder Chat-Korpus http://www.chatkorpus.uni-dortmund.de/ (corpus linguistics research conversation)
 [] Index of /xml http://dblp.uni-trier.de/xml/ (corpus datamining dataset bibliometrics)
 [] LDC Catalog - Topic Annotated Enron Email Data Set http://www.ldc.upenn.edu/Catalog/CatalogEntry.jsp?catalogId=LDC2007T22 (email nlp corpus enron)
 [] Summarization, Generation, Interaction: The Enron Email Corpus http://www.sgi.nu/enron/ (email corpus enron dataset)
 [] Wired News: Science Puts Enron E-Mail to Use http://www.wired.com/news/technology/0,70100-0.html (email enron media corpus)
 [] Umbromancy: Research projects http://umbromancy.blogspot.com/2005/06/research-projects.html (litreview email corpus research)
 [] i6doc.com http://www.i6doc.com/I6Doc/WebObjects/I6Doc5.woa/wa/DocumentDA/document?language=FR&d=1008272 (corpus sms dataset)
 [] T E X T F I L E S http://www.textfiles.com/games/ATARIMAIL/ (email history games programming)
 [] SPAAC -- A Speech Act Annotated Corpus for Dialogue Systems: Pilot Project http://bowland-files.lancs.ac.uk/groups/spaac/SPAAC.htm (email markup corpus speechact)
 [] FERC: Information Released in Enron Investigation http://www.ferc.gov/industries/electric/indus-act/wec/enron/info-release.asp (email corpus enron dataset)
 [] Tagged datasets for named entity recognition tasks http://www.cs.technion.ac.il/~gabr/resources/data/ne_datasets.html (nlp ner corpus dataset)
 [] Web as Corpus at CL 2005 http://sslmit.unibo.it/~baroni/web_as_corpus_cl05.html (nlp corpus linguistics research)
 [] Email Datasets http://www-2.cs.cmu.edu/~einat/datasets.html (email enron nlp corpus)
 [] Coconut Corpus http://www.pitt.edu/~coconut/coconut-corpus.html (dialogue linguistics corpus annotation)
 [] Cyber-Neologoliferation - New York Times http://www.nytimes.com/2006/11/05/magazine/05cyber.html?ei=5070&en=fc45aaa65900cf95&ex=11857 ... (words oed corpus language)
 [] ..::[Sentence Classification]::.. http://www.csse.monash.edu.au/~anthonya/ (research email nlp corpus)
 [] Sketch Engine Corpus Tool http://corpora.sketchengine.co.uk/auth/ (tool corpus)
 [] Web-as-Corpus kool ynitiative (Wacky) http://wacky.sslmit.unibo.it/doku.php (corpus dataset mysql)
 [] README for the NUS SMS Corpus http://www.comp.nus.edu.sg/~rpnlpir/downloads/corpora/smsCorpus/ (sms corpus dataset)
 [] IJCAI 2007 Workshop on Analytics for Noisy Unstructured Text Data http://research.ihost.com/and2007/data.html (research corpus dataset)
 [] Ted Pedersen - Name Discrimination Data / Name Disambiguation Data http://www.d.umn.edu/~tpederse/namedata.html (research nlp corpus resource)