GitHub Repository: jjtorrens/learnlatex.github.io
Path: blob/main/robots.txt
################################# ROBOTS.TXT ###################################
#                                                                              #
# Alphabetically ordered whitelisting of legitimate web robots, which obey the #
# Robots Exclusion Standard (robots.txt). Each bot is briefly described in a   #
# comment above the (list of) user-agent(s). Comment out or delete lines which #
# contain User-agents you do not wish to allow on your website.                #
# Important: Blank lines are not allowed in the final robots.txt file!         #
# Updates can be retrieved from: https://github.com/jonasjacek/robots.txt      #
#                                                                              #
# This document is licensed under a CC BY-NC-SA 4.0 license.                   #
#                                                                              #
# Last update: 2020-12-17                                                      #
#                                                                              #
################################################################################
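# example (illustrative only; CCBot is picked arbitrarily, not a recommendation):
# to stop whitelisting a bot, comment out its User-agent line so that the bot
# falls through to the "User-agent: *" group at the end of this file:
#   # commoncrawl.org open repository of web crawl data
#   # User-agent: CCBot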
# so.com chinese search engine
User-agent: 360Spider
User-agent: 360Spider-Image
User-agent: 360Spider-Video
# google.com landing page quality checks
User-agent: AdsBot-Google
User-agent: AdsBot-Google-Mobile
# google.com app resource fetcher
User-agent: AdsBot-Google-Mobile-Apps
# bing ads bot
User-agent: adidxbot
# apple.com search engine
User-agent: Applebot
User-agent: AppleNewsBot
# baidu.com chinese search engine
User-agent: Baiduspider
User-agent: Baiduspider-image
User-agent: Baiduspider-news
User-agent: Baiduspider-video
# bing.com international search engine
User-agent: bingbot
User-agent: BingPreview
# bublup.com suggestion/search engine
User-agent: BublupBot
# commoncrawl.org open repository of web crawl data
User-agent: CCBot
# cliqz.com german in-product search engine
User-agent: Cliqzbot
# coccoc.com vietnamese search engine
User-agent: coccoc
User-agent: coccocbot-image
User-agent: coccocbot-web
# daum.net korean search engine
User-agent: Daumoa
# dazoo.fr french search engine
User-agent: Dazoobot
# deusu.de german search engine
User-agent: DeuSu
# duckduckgo.com international privacy search engine
User-agent: DuckDuckBot
User-agent: DuckDuckGo-Favicons-Bot
# eurip.com european search engine
User-agent: EuripBot
# exploratodo.com latin search engine
User-agent: Exploratodo
# facebook.com social network
User-agent: Facebot
# feedly.com feed fetcher
User-agent: Feedly
# findx.com european search engine
User-agent: Findxbot
# goo.ne.jp japanese search engine
User-agent: gooblog
# google.com international search engine
User-agent: Googlebot
User-agent: Googlebot-Image
User-agent: Googlebot-Mobile
User-agent: Googlebot-News
User-agent: Googlebot-Video
# so.com chinese search engine
User-agent: HaoSouSpider
# goo.ne.jp japanese search engine
User-agent: ichiro
# istella.it italian search engine
User-agent: istellabot
# jike.com / chinaso.com chinese search engine
User-agent: JikeSpider
# lycos.com & hotbot.com international search engine
User-agent: Lycos
# mail.ru russian search engine
User-agent: Mail.Ru
# google.com adsense bot
User-agent: Mediapartners-Google
# mojeek.com search engine
User-agent: MojeekBot
# bing.com international search engine
User-agent: msnbot
User-agent: msnbot-media
# orange.com international search engine
User-agent: OrangeBot
# pinterest.com social network
User-agent: Pinterest
# botje.nl dutch search engine
User-agent: Plukkie
# qwant.com french search engine
User-agent: Qwantify
# rambler.ru russian search engine
User-agent: Rambler
# seznam.cz czech search engine
User-agent: SeznamBot
# soso.com chinese search engine
User-agent: Sosospider
# yahoo.com international search engine
User-agent: Slurp
# sogou.com chinese search engine
User-agent: Sogou blog
User-agent: Sogou inst spider
User-agent: Sogou News Spider
User-agent: Sogou Orion spider
User-agent: Sogou spider2
User-agent: Sogou web spider
# sputnik.ru russian search engine
User-agent: SputnikBot
# ask.com international search engine
User-agent: Teoma
# twitter.com bot
User-agent: Twitterbot
# wotbox.com international search engine
User-agent: wotbox
# yacy.net p2p search software
User-agent: yacybot
# yandex.com russian search engine
User-agent: Yandex
User-agent: YandexMobileBot
# search.naver.com south korean search engine
User-agent: Yeti
# yioop.com international search engine
User-agent: YioopBot
# yooz.ir iranian search engine
User-agent: yoozBot
# youdao.com chinese search engine
User-agent: YoudaoBot
# crawling rule(s) for above bots (an empty Disallow allows everything)
Disallow:
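# example (assumption: /private/ is a hypothetical path, not part of the
# upstream list): to keep the bots above out of one directory while allowing
# everything else, the empty rule could instead read:
#   Disallow: /private/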
# disallow all other bots
User-agent: *
Disallow: /