If you are running a performance test, it is always a good idea to use a real dataset. The following are several useful datasets.
Often you can find data in the CSV format, and then parsing and using it is pretty easy.
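As a rough illustration of how little code CSV parsing needs, here is a minimal Python sketch using the standard `csv` module. The `load_csv_rows` helper, the `SAMPLE` text, and its column names are made up for this example; in a real test you would read a file downloaded from one of the sources listed here instead.

```python
import csv
import io
import time


def load_csv_rows(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


# Hypothetical sample standing in for a downloaded dataset file.
SAMPLE = "title,year\nPaper A,2001\nPaper B,2005\n"

# Time the parse step, as you might when preparing a performance test.
start = time.perf_counter()
rows = load_csv_rows(SAMPLE)
elapsed = time.perf_counter() - start

print(len(rows), "rows parsed in", elapsed, "seconds")
```

For a large file, iterating over `csv.DictReader` row by row instead of building a list keeps memory use flat.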
- DBLP catalog - http://kdl.cs.umass.edu/data/dblp/dblp-info.html - this is data about publications. About 900MB raw size.
- Google Fusion tables - http://www.google.com/fusiontables/Home - this has several useful datasets as CSV.
- Federal reserve economic data - http://research.stlouisfed.org/fred2/
- Amazon public datasets - aws.amazon.com/publicdatasets/
There are a lot more; the following are some of them. If anyone knows of a list giving the sizes and nature of these datasets, please let me know.
- http://dvn.iq.harvard.edu/dvn/dv/cid
- http://www.thejanuarist.com/9-fascinating-datasets-available-online-for-free/
- http://bios.dfg.ca.gov/dataset_index.asp
- http://www.uic.edu/orgs/rin/dataset.html
- http://www.nas.nasa.gov/Resources/datasets.html
- http://www.datawrangling.com/some-datasets-available-on-the-web
- http://news.ycombinator.com/item?id=2165497
- https://bitly.com/bundles/hmason/1
- http://www.gutenberg.org/wiki/Gutenberg:Feeds
- http://www.quora.com/Data/Where-can-I-get-large-datasets-open-to-the-public
- http://getthedata.org/