Spidering Hacks
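Authors: | Hemenway, Kevin; Calishain, Tara |
---|---|
Corporate authors: | O'Reilly for Higher Education (Firm); Safari, an O'Reilly Media Company |
Format: | Electronic e-book |
Language: | English |
Published: | [Place of publication not identified]: O'Reilly Media, Incorporated, 2003 |
Edition: | 1st edition. |
Series: | Hacks series |
Subjects: | Computer software--Reusability; Web search engines; Mobile agents (Computer software); Internet searching; Internet programming |
Links: | https://learning.oreilly.com/library/view/-/0596005776/?ar |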
Summary: | The Internet, with its profusion of information, has made us hungry for ever more, ever better data. Out of necessity, many of us have become pretty adept with search engine queries, but there are times when even the most powerful search engines aren't enough. If you've ever wanted your data in a different form than it's presented, or wanted to collect data from several sites and see it side-by-side without the constraints of a browser, then Spidering Hacks is for you. Spidering Hacks takes you to the next level in Internet data retrieval--beyond search engines--by showing you how to create spiders and bots to retrieve information from your favorite sites and data sources. You'll no longer feel constrained by the way host sites think you want to see their data presented--you'll learn how to scrape and repurpose raw data so you can view it in a way that's meaningful to you. Written for developers, researchers, technical assistants, librarians, and power users, Spidering Hacks provides expert tips on spidering and scraping methodologies. You'll begin with a crash course in spidering concepts, tools (Perl, LWP, out-of-the-box utilities), and ethics (how to know when you've gone too far: what's acceptable and unacceptable). Next, you'll collect media files and data from databases. Then you'll learn how to interpret and understand the data, repurpose it for use in other applications, and even build authorized interfaces to integrate the data into your own content. By the time you finish Spidering Hacks, you'll be able to: aggregate and associate data from disparate locations, then store and manipulate the data as you like; gain a competitive edge in business by knowing when competitors' products are on sale, and comparing sales ranks and product placement on e-commerce sites; integrate third-party data into your own applications or web sites; make your own site easier to scrape and more usable to others; and keep up to date with your favorite comic strips, news stories, stock tips, and more without visiting the site every day. Like the other books in O'Reilly's popular Hacks series, Spidering Hacks brings you 100 industrial-strength tips and tools from the experts to help you master this technology. If you're interested in data retrieval of any type, this book provides a wealth of data for finding a wealth of data. |
Description: | Online resource; title from title page (viewed October 28, 2003) |
Extent: | 1 online resource (432 pages). |
ISBN: | 9780596005771 0596005776 |
Internal format
MARC
LEADER | 00000cam a22000002 4500 | ||
---|---|---|---|
001 | ZDB-30-ORH-051701014 | ||
003 | DE-627-1 | ||
005 | 20241129125342.0 | ||
007 | cr uuu---uuuuu | ||
008 | 200417s2003 xx |||||o 00| ||eng c | ||
020 | |a 9780596005771 |c paperback |9 978-0-596-00577-1 | ||
020 | |a 0596005776 |c paperback |9 0-596-00577-6 | ||
035 | |a (DE-627-1)051701014 | ||
035 | |a (DE-599)KEP051701014 | ||
035 | |a (ORHE)0596005776 | ||
035 | |a (DE-627-1)051701014 | ||
040 | |a DE-627 |b ger |c DE-627 |e rda | ||
041 | |a eng | ||
082 | 0 | |a 006.3 |2 21/eng/20230216 | |
100 | 1 | |a Hemenway, Kevin |e VerfasserIn |4 aut | |
245 | 1 | 0 | |a Spidering Hacks |c Kevin Hemenway & Tara Calishain |
250 | |a 1st edition. | ||
264 | 1 | |a [Place of publication not identified] |b O'Reilly Media, Incorporated |c 2003 | |
300 | |a 1 Online-Ressource (432 Seiten). | ||
336 | |a Text |b txt |2 rdacontent | ||
337 | |a Computermedien |b c |2 rdamedia | ||
338 | |a Online-Ressource |b cr |2 rdacarrier | ||
490 | 0 | |a Hacks series | |
500 | |a Online resource; Title from title page (viewed October 28, 2003) | ||
520 | |a The Internet, with its profusion of information, has made us hungry for ever more, ever better data. Out of necessity, many of us have become pretty adept with search engine queries, but there are times when even the most powerful search engines aren't enough. If you've ever wanted your data in a different form than it's presented, or wanted to collect data from several sites and see it side-by-side without the constraints of a browser, then Spidering Hacks is for you. Spidering Hacks takes you to the next level in Internet data retrieval--beyond search engines--by showing you how to create spiders and bots to retrieve information from your favorite sites and data sources. You'll no longer feel constrained by the way host sites think you want to see their data presented--you'll learn how to scrape and repurpose raw data so you can view in a way that's meaningful to you. Written for developers, researchers, technical assistants, librarians, and power users, Spidering Hacks provides expert tips on spidering and scraping methodologies. You'll begin with a crash course in spidering concepts, tools (Perl, LWP, out-of-the-box utilities), and ethics (how to know when you've gone too far: what's acceptable and unacceptable). Next, you'll collect media files and data from databases. Then you'll learn how to interpret and understand the data, repurpose it for use in other applications, and even build authorized interfaces to integrate the data into your own content. 
By the time you finish Spidering Hacks , you'll be able to: Aggregate and associate data from disparate locations, then store and manipulate the data as you like Gain a competitive edge in business by knowing when competitors' products are on sale, and comparing sales ranks and product placement on e-commerce sites Integrate third-party data into your own applications or web sites Make your own site easier to scrape and more usable to others Keep up-to-date with your favorite comics strips, news stories, stock tips, and more without visiting the site every day Like the other books in O'Reilly's popular Hacks series, Spidering Hacks brings you 100 industrial-strength tips and tools from the experts to help you master this technology. If you're interested in data retrieval of any type, this book provides a wealth of data for finding a wealth of data. | ||
650 | 0 | |a Computer software |x Reusability | |
650 | 0 | |a Web search engines | |
650 | 0 | |a Mobile agents (Computer software) | |
650 | 0 | |a Internet searching | |
650 | 0 | |a Internet programming | |
650 | 4 | |a Logiciels ; Réutilisation | |
650 | 4 | |a Moteurs de recherche sur Internet | |
650 | 4 | |a Agents mobiles (Logiciels) | |
650 | 4 | |a Recherche sur Internet | |
650 | 4 | |a Programmation Internet | |
650 | 4 | |a Web search engines | |
650 | 4 | |a Mobile agents (Computer software) | |
650 | 4 | |a Internet searching | |
650 | 4 | |a Internet programming | |
650 | 4 | |a Computer software ; Reusability | |
700 | 1 | |a Calishain, Tara |e VerfasserIn |4 aut | |
710 | 2 | |a O'Reilly for Higher Education (Firm), |e MitwirkendeR |4 ctb | |
710 | 2 | |a Safari, an O'Reilly Media Company. |e MitwirkendeR |4 ctb | |
776 | 1 | |z 0596005776 | |
776 | 0 | 8 | |i Erscheint auch als |n Druck-Ausgabe |z 0596005776 |
966 | 4 | 0 | |l DE-91 |p ZDB-30-ORH |q TUM_PDA_ORH |u https://learning.oreilly.com/library/view/-/0596005776/?ar |m X:ORHE |x Aggregator |z lizenzpflichtig |3 Volltext |
912 | 1 | |a ZDB-30-ORH |d 20241129 | |
951 | |a BO | ||
912 | |a ZDB-30-ORH | ||
049 | |a DE-91 |
Record in the search index
DE-BY-TUM_katkey | ZDB-30-ORH-051701014 |
---|---|
_version_ | 1821494922769858560 |
adam_text | |
any_adam_object | |
author | Hemenway, Kevin Calishain, Tara |
author_corporate | O'Reilly for Higher Education (Firm) Safari, an O'Reilly Media Company |
author_corporate_role | ctb ctb |
author_facet | Hemenway, Kevin Calishain, Tara O'Reilly for Higher Education (Firm) Safari, an O'Reilly Media Company |
author_role | aut aut |
author_sort | Hemenway, Kevin |
author_variant | k h kh t c tc |
building | Verbundindex |
bvnumber | localTUM |
collection | ZDB-30-ORH |
ctrlnum | (DE-627-1)051701014 (DE-599)KEP051701014 (ORHE)0596005776 |
dewey-full | 006.3 |
dewey-hundreds | 000 - Computer science, information, general works |
dewey-ones | 006 - Special computer methods |
dewey-raw | 006.3 |
dewey-search | 006.3 |
dewey-sort | 16.3 |
dewey-tens | 000 - Computer science, information, general works |
discipline | Informatik |
edition | 1st edition. |
format | Electronic eBook |
id | ZDB-30-ORH-051701014 |
illustrated | Not Illustrated |
indexdate | 2025-01-17T11:22:05Z |
institution | BVB |
isbn | 9780596005771 0596005776 |
language | English |
open_access_boolean | |
owner | DE-91 DE-BY-TUM |
owner_facet | DE-91 DE-BY-TUM |
physical | 1 Online-Ressource (432 Seiten). |
psigel | ZDB-30-ORH TUM_PDA_ORH ZDB-30-ORH |
publishDate | 2003 |
publishDateSearch | 2003 |
publishDateSort | 2003 |
publisher | O'Reilly Media, Incorporated |
record_format | marc |
series2 | Hacks series |
spellingShingle | Hemenway, Kevin Calishain, Tara Spidering Hacks Computer software Reusability Web search engines Mobile agents (Computer software) Internet searching Internet programming Logiciels ; Réutilisation Moteurs de recherche sur Internet Agents mobiles (Logiciels) Recherche sur Internet Programmation Internet Computer software ; Reusability |
title | Spidering Hacks |
title_auth | Spidering Hacks |
title_exact_search | Spidering Hacks |
title_full | Spidering Hacks Kevin Hemenway & Tara Calishain |
title_fullStr | Spidering Hacks Kevin Hemenway & Tara Calishain |
title_full_unstemmed | Spidering Hacks Kevin Hemenway & Tara Calishain |
title_short | Spidering Hacks |
title_sort | spidering hacks |
topic | Computer software Reusability Web search engines Mobile agents (Computer software) Internet searching Internet programming Logiciels ; Réutilisation Moteurs de recherche sur Internet Agents mobiles (Logiciels) Recherche sur Internet Programmation Internet Computer software ; Reusability |
topic_facet | Computer software Reusability Web search engines Mobile agents (Computer software) Internet searching Internet programming Logiciels ; Réutilisation Moteurs de recherche sur Internet Agents mobiles (Logiciels) Recherche sur Internet Programmation Internet Computer software ; Reusability |
work_keys_str_mv | AT hemenwaykevin spideringhacks AT calishaintara spideringhacks AT oreillyforhighereducationfirm spideringhacks AT safarianoreillymediacompany spideringhacks |