{"id":1039,"date":"2024-07-05T14:29:41","date_gmt":"2024-07-05T13:29:41","guid":{"rendered":"https:\/\/emgruppen.com\/?p=1039"},"modified":"2024-06-27T14:35:11","modified_gmt":"2024-06-27T13:35:11","slug":"reddit-to-enhance-web-standards-to-block-automated-data-scraping","status":"publish","type":"post","link":"https:\/\/emgruppen.com\/index.php\/2024\/07\/05\/reddit-to-enhance-web-standards-to-block-automated-data-scraping\/","title":{"rendered":"Reddit to Enhance Web Standards to Block Automated Data Scraping"},"content":{"rendered":"\n<p>Social media company Reddit (RDDT.N) said on Tuesday it would update a web standard used by the site to block automated data scraping, following recent reports that some AI startups were bypassing the rule to collect content for their systems.<\/p>\n\n\n\n<p>The move comes amid growing pressure from publishers, who say AI companies use their content to generate AI-driven summaries without credit or permission.<\/p>\n\n\n\n<p>Reddit will update the Robots Exclusion Protocol, commonly known as &#8220;robots.txt,&#8221; a widely used standard that tells automated crawlers which parts of a website they are allowed to access. Reddit will also continue to use rate limiting, which caps the number of requests accepted from any single entity, and will block unknown bots and crawlers from scraping its platform, that is, from collecting and storing raw data.<\/p>\n\n\n\n<p>The robots.txt file has only recently taken center stage as a key tool for publishers seeking to prevent tech companies from scraping their content free of charge to train AI algorithms and to generate summaries in response to some search queries. Content licensing startup TollBit last week told publishers that several AI firms were bypassing the web standard to scrape sites. 
The warning came a day after Wired reported that AI search startup Perplexity had circumvented efforts to block its web crawler through robots.txt.<\/p>\n\n\n\n<p>Last week, business media publisher Forbes accused Perplexity of plagiarizing its investigative stories for use in generative AI systems without proper credit. Reddit said researchers and organizations such as the Internet Archive will still have access to its content for non-commercial purposes.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Social media company Reddit (RDDT.N) said on Tuesday it would update a web standard used by the site to block automated data scraping, following recent reports that some AI startups were bypassing the rule to collect content for their systems. 
This move comes after increased pressure from publishers, which claim that [&hellip;]<\/p>\n","protected":false},"author":99,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"give_campaign_id":0,"om_disable_all_campaigns":false,"_uf_show_specific_survey":0,"_uf_disable_surveys":false,"footnotes":""},"categories":[153],"tags":[241,235,246,233],"class_list":["post-1039","post","type-post","status-publish","format-standard","hentry","category-tech-news","tag-cybersecurity-2","tag-emgruppen","tag-reddit","tag-tech-2"],"aioseo_notices":[],"_links":{"self":[{"href":"https:\/\/emgruppen.com\/index.php\/wp-json\/wp\/v2\/posts\/1039","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/emgruppen.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/emgruppen.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/emgruppen.com\/index.php\/wp-json\/wp\/v2\/users\/99"}],"replies":[{"embeddable":true,"href":"https:\/\/emgruppen.com\/index.php\/wp-json\/wp\/v2\/comments?post=1039"}],"version-history":[{"count":0,"href":"https:\/\/emgruppen.com\/index.php\/wp-json\/wp\/v2\/posts\/1039\/revisions"}],"wp:attachment":[{"href":"https:\/\/emgruppen.com\/index.php\/wp-json\/wp\/v2\/media?parent=1039"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/emgruppen.com\/index.php\/wp-json\/wp\/v2\/categories?post=1039"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/emgruppen.com\/index.php\/wp-json\/wp\/v2\/tags?post=1039"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}