2025-11-13 04:30:17 [scrapy.utils.log] INFO: Scrapy 2.12.0 started (bot: zomato_dining)
2025-11-13 04:30:17 [scrapy.utils.log] INFO: Versions: lxml 5.3.0.0, libxml2 2.12.9, cssselect 1.2.0, parsel 1.9.1, w3lib 2.2.1, Twisted 24.11.0, Python 3.10.12 (main, Aug 15 2025, 14:32:43) [GCC 11.4.0], pyOpenSSL 24.3.0 (OpenSSL 3.4.0 22 Oct 2024), cryptography 44.0.0, Platform Linux-6.8.0-1040-aws-aarch64-with-glibc2.35
2025-11-13 04:30:17 [dining_insights_zomato] INFO: Dynamic attribute _job = 123_2025-11-13T04_30_02
2025-11-13 04:30:17 [dining_insights_zomato] INFO: Dynamic attribute scheduled_job_id = 8051
2025-11-13 04:30:17 [dining_insights_zomato] INFO: Dynamic attribute SCRAPEOPS_JOB_NAME = 123
2025-11-13 04:30:17 [scrapy.addons] INFO: Enabled addons: []
2025-11-13 04:30:17 [py.warnings] WARNING: /home/ubuntu/restaverse_spiders/venv/lib/python3.10/site-packages/scrapy/utils/request.py:120: ScrapyDeprecationWarning: 'REQUEST_FINGERPRINTER_IMPLEMENTATION' is a deprecated setting. It will be removed in a future version of Scrapy.
  return cls(crawler)
2025-11-13 04:30:17 [asyncio] DEBUG: Using selector: EpollSelector
2025-11-13 04:30:17 [scrapy.utils.log] DEBUG: Using reactor: twisted.internet.asyncioreactor.AsyncioSelectorReactor
2025-11-13 04:30:17 [scrapy.utils.log] DEBUG: Using asyncio event loop: asyncio.unix_events._UnixSelectorEventLoop
2025-11-13 04:30:17 [scrapy.utils.log] DEBUG: Using reactor: twisted.internet.asyncioreactor.AsyncioSelectorReactor
2025-11-13 04:30:17 [scrapy.utils.log] DEBUG: Using asyncio event loop: asyncio.unix_events._UnixSelectorEventLoop
2025-11-13 04:30:17 [scrapy.extensions.telnet] INFO: Telnet Password: 75f0745cfd4909d1
2025-11-13 04:30:17 [scrapy.middleware] INFO: Enabled extensions: ['scrapy.extensions.corestats.CoreStats', 'scrapy.extensions.telnet.TelnetConsole', 'scrapy.extensions.memusage.MemoryUsage', 'scrapy.extensions.logstats.LogStats', 'scrapy_extensions.extension.BandwidthLoggerExtension', 'scrapeops_scrapy.extension.ScrapeOpsMonitor']
2025-11-13 04:30:17 [scrapy.crawler] INFO: Overridden settings: {'BOT_NAME': 'zomato_dining', 'CONCURRENT_REQUESTS': 8, 'DOWNLOAD_DELAY': 0.5, 'FEED_EXPORT_ENCODING': 'utf-8', 'LOG_FILE': '/home/ubuntu/restaverse_spiders/logs/zomato_dining/dining_insights_zomato/123_2025-11-13T04_30_02.log', 'NEWSPIDER_MODULE': 'zomato_dining.spiders', 'REQUEST_FINGERPRINTER_IMPLEMENTATION': '2.7', 'SPIDER_MODULES': ['zomato_dining.spiders'], 'TWISTED_REACTOR': 'twisted.internet.asyncioreactor.AsyncioSelectorReactor', 'USER_AGENT': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, ' 'like Gecko) Chrome/127.0.0.0 Safari/537.36'}
2025-11-13 04:30:18 [faker.factory] DEBUG: Not in REPL -> leaving logger event level as is.
2025-11-13 04:30:18 [faker.factory] DEBUG: Looking for locale `en_US` in provider `faker.providers.address`.
2025-11-13 04:30:18 [faker.factory] DEBUG: Provider `faker.providers.address` has been localized to `en_US`.
2025-11-13 04:30:18 [faker.factory] DEBUG: Looking for locale `en_US` in provider `faker.providers.automotive`.
2025-11-13 04:30:18 [faker.factory] DEBUG: Provider `faker.providers.automotive` has been localized to `en_US`.
2025-11-13 04:30:18 [faker.factory] DEBUG: Looking for locale `en_US` in provider `faker.providers.bank`.
2025-11-13 04:30:18 [faker.factory] DEBUG: Specified locale `en_US` is not available for provider `faker.providers.bank`. Locale reset to `en_GB` for this provider.
2025-11-13 04:30:18 [faker.factory] DEBUG: Looking for locale `en_US` in provider `faker.providers.barcode`.
2025-11-13 04:30:18 [faker.factory] DEBUG: Provider `faker.providers.barcode` has been localized to `en_US`.
2025-11-13 04:30:18 [faker.factory] DEBUG: Looking for locale `en_US` in provider `faker.providers.color`.
2025-11-13 04:30:18 [faker.factory] DEBUG: Provider `faker.providers.color` has been localized to `en_US`.
2025-11-13 04:30:18 [faker.factory] DEBUG: Looking for locale `en_US` in provider `faker.providers.company`.
2025-11-13 04:30:18 [faker.factory] DEBUG: Provider `faker.providers.company` has been localized to `en_US`.
2025-11-13 04:30:18 [faker.factory] DEBUG: Looking for locale `en_US` in provider `faker.providers.credit_card`.
2025-11-13 04:30:18 [faker.factory] DEBUG: Provider `faker.providers.credit_card` has been localized to `en_US`.
2025-11-13 04:30:18 [faker.factory] DEBUG: Looking for locale `en_US` in provider `faker.providers.currency`.
2025-11-13 04:30:18 [faker.factory] DEBUG: Provider `faker.providers.currency` has been localized to `en_US`.
2025-11-13 04:30:18 [faker.factory] DEBUG: Looking for locale `en_US` in provider `faker.providers.date_time`.
2025-11-13 04:30:18 [faker.factory] DEBUG: Provider `faker.providers.date_time` has been localized to `en_US`.
2025-11-13 04:30:18 [faker.factory] DEBUG: Provider `faker.providers.emoji` does not feature localization. Specified locale `en_US` is not utilized for this provider.
2025-11-13 04:30:18 [faker.factory] DEBUG: Provider `faker.providers.file` does not feature localization. Specified locale `en_US` is not utilized for this provider.
2025-11-13 04:30:18 [faker.factory] DEBUG: Looking for locale `en_US` in provider `faker.providers.geo`.
2025-11-13 04:30:18 [faker.factory] DEBUG: Provider `faker.providers.geo` has been localized to `en_US`.
2025-11-13 04:30:18 [faker.factory] DEBUG: Looking for locale `en_US` in provider `faker.providers.internet`.
2025-11-13 04:30:18 [faker.factory] DEBUG: Provider `faker.providers.internet` has been localized to `en_US`.
2025-11-13 04:30:18 [faker.factory] DEBUG: Looking for locale `en_US` in provider `faker.providers.isbn`.
2025-11-13 04:30:18 [faker.factory] DEBUG: Provider `faker.providers.isbn` has been localized to `en_US`.
2025-11-13 04:30:18 [faker.factory] DEBUG: Looking for locale `en_US` in provider `faker.providers.job`.
2025-11-13 04:30:18 [faker.factory] DEBUG: Provider `faker.providers.job` has been localized to `en_US`.
2025-11-13 04:30:18 [faker.factory] DEBUG: Looking for locale `en_US` in provider `faker.providers.lorem`.
2025-11-13 04:30:18 [faker.factory] DEBUG: Provider `faker.providers.lorem` has been localized to `en_US`.
2025-11-13 04:30:18 [faker.factory] DEBUG: Looking for locale `en_US` in provider `faker.providers.misc`.
2025-11-13 04:30:18 [faker.factory] DEBUG: Provider `faker.providers.misc` has been localized to `en_US`.
2025-11-13 04:30:18 [faker.factory] DEBUG: Looking for locale `en_US` in provider `faker.providers.passport`.
2025-11-13 04:30:18 [faker.factory] DEBUG: Provider `faker.providers.passport` has been localized to `en_US`.
2025-11-13 04:30:18 [faker.factory] DEBUG: Looking for locale `en_US` in provider `faker.providers.person`.
2025-11-13 04:30:18 [faker.factory] DEBUG: Provider `faker.providers.person` has been localized to `en_US`.
2025-11-13 04:30:18 [faker.factory] DEBUG: Looking for locale `en_US` in provider `faker.providers.phone_number`.
2025-11-13 04:30:18 [faker.factory] DEBUG: Provider `faker.providers.phone_number` has been localized to `en_US`.
2025-11-13 04:30:18 [faker.factory] DEBUG: Provider `faker.providers.profile` does not feature localization. Specified locale `en_US` is not utilized for this provider.
2025-11-13 04:30:18 [faker.factory] DEBUG: Provider `faker.providers.python` does not feature localization. Specified locale `en_US` is not utilized for this provider.
2025-11-13 04:30:18 [faker.factory] DEBUG: Provider `faker.providers.sbn` does not feature localization. Specified locale `en_US` is not utilized for this provider.
2025-11-13 04:30:18 [faker.factory] DEBUG: Looking for locale `en_US` in provider `faker.providers.ssn`.
2025-11-13 04:30:18 [faker.factory] DEBUG: Provider `faker.providers.ssn` has been localized to `en_US`.
2025-11-13 04:30:18 [faker.factory] DEBUG: Provider `faker.providers.user_agent` does not feature localization. Specified locale `en_US` is not utilized for this provider.
2025-11-13 04:30:18 [scrapy_fake_useragent.middleware] DEBUG: Loaded User-Agent provider: scrapy_fake_useragent.providers.FakerProvider
2025-11-13 04:30:18 [scrapy_fake_useragent.middleware] INFO: Using '' as the User-Agent provider
2025-11-13 04:30:18 [scrapy.middleware] INFO: Enabled downloader middlewares: ['scrapy.downloadermiddlewares.offsite.OffsiteMiddleware', 'zomato_dining.middlewares.RedisCacheMiddleware', 'scrapy.downloadermiddlewares.httpauth.HttpAuthMiddleware', 'scrapy.downloadermiddlewares.downloadtimeout.DownloadTimeoutMiddleware', 'scrapy.downloadermiddlewares.defaultheaders.DefaultHeadersMiddleware', 'scrapy_fake_useragent.middleware.RandomUserAgentMiddleware', 'scrapy.downloadermiddlewares.useragent.UserAgentMiddleware', 'scrapeops_scrapy.middleware.retry.RetryMiddleware', 'scrapy.downloadermiddlewares.redirect.MetaRefreshMiddleware', 'scrapy.downloadermiddlewares.httpcompression.HttpCompressionMiddleware', 'scrapy.downloadermiddlewares.redirect.RedirectMiddleware', 'scrapy.downloadermiddlewares.cookies.CookiesMiddleware', 'scrapy.downloadermiddlewares.httpproxy.HttpProxyMiddleware', 'scrapy.downloadermiddlewares.stats.DownloaderStats']
2025-11-13 04:30:18 [scrapy.middleware] INFO: Enabled spider middlewares: ['scrapy.spidermiddlewares.httperror.HttpErrorMiddleware', 'zomato_dining.middlewares.ZomatoDiningSpiderMiddleware', 'scrapy.spidermiddlewares.referer.RefererMiddleware', 'scrapy.spidermiddlewares.urllength.UrlLengthMiddleware', 'scrapy.spidermiddlewares.depth.DepthMiddleware']
2025-11-13 04:30:18 [scrapy.middleware] INFO: Enabled item pipelines: ['zomato_dining.pipelines.ZomatoDiningPipeline']
2025-11-13 04:30:18 [scrapy.core.engine] INFO: Spider opened
2025-11-13 04:30:18 [scrapy.extensions.logstats] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
2025-11-13 04:30:18 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): api.scrapeops.io:443
2025-11-13 04:30:20 [urllib3.connectionpool] DEBUG: https://api.scrapeops.io:443 "POST /api/v1/setup/ HTTP/11" 200 None
2025-11-13 04:30:20 [dining_insights_zomato] INFO: Spider opened: dining_insights_zomato
2025-11-13 04:30:20 [scrapy.extensions.telnet] INFO: Telnet console listening on 127.0.0.1:6037
2025-11-13 04:30:20 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): cron.restaverse.com:443
2025-11-13 04:30:20 [urllib3.connectionpool] DEBUG: https://cron.restaverse.com:443 "POST /api/db_services/fetch-query HTTP/11" 200 2867
2025-11-13 04:30:20 [scrapy_fake_useragent.middleware] DEBUG: Assign User-Agent Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_5_8 rv:2.0; mhr-RU) AppleWebKit/535.49.2 (KHTML, like Gecko) Version/5.1 Safari/535.49.2 to Proxy http://scrapeops.country=in:dbc08e8a-4f73-4088-9f13-c2db0a774557@residential-proxy.scrapeops.io:8181
2025-11-13 04:30:21 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): api.scrapeops.io:443
2025-11-13 04:30:21 [urllib3.connectionpool] DEBUG: https://api.scrapeops.io:443 "POST /api/v1/normalizer/proxy_port/?job_id=5374795 HTTP/11" 200 None
2025-11-13 04:30:21 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): api.scrapeops.io:443
2025-11-13 04:30:22 [urllib3.connectionpool] DEBUG: https://api.scrapeops.io:443 "POST /api/v1/normalizer/domain/?domain=zomato.com HTTP/11" 200 241
2025-11-13 04:30:24 [scrapeops_scrapy.middleware.retry] DEBUG: Retrying (failed 1 times): Could not open CONNECT tunnel with proxy residential-proxy.scrapeops.io:8181 [{'status': 401, 'reason': b'Unauthorized'}]
2025-11-13 04:30:24 [scrapy_fake_useragent.middleware] DEBUG: Assign User-Agent Opera/8.56.(Windows NT 5.1; ga-IE) Presto/2.9.164 Version/12.00 to Proxy http://residential-proxy.scrapeops.io:8181
2025-11-13 04:30:26 [scrapeops_scrapy.middleware.retry] DEBUG: Retrying (failed 2 times): Could not open CONNECT tunnel with proxy residential-proxy.scrapeops.io:8181 [{'status': 401, 'reason': b'Unauthorized'}]
2025-11-13 04:30:29 [scrapeops_scrapy.middleware.retry] ERROR: Gave up retrying (failed 3 times): Could not open CONNECT tunnel with proxy residential-proxy.scrapeops.io:8181 [{'status': 401, 'reason': b'Unauthorized'}]
2025-11-13 04:30:29 [scrapy.core.engine] INFO: Closing spider (finished)
2025-11-13 04:30:29 [scrapy.core.engine] ERROR: Scraper close failure
Traceback (most recent call last):
  File "/home/ubuntu/restaverse_spiders/venv/lib/python3.10/site-packages/pandas/core/indexes/base.py", line 3805, in get_loc
    return self._engine.get_loc(casted_key)
  File "index.pyx", line 167, in pandas._libs.index.IndexEngine.get_loc
  File "index.pyx", line 196, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/hashtable_class_helper.pxi", line 7081, in pandas._libs.hashtable.PyObjectHashTable.get_item
  File "pandas/_libs/hashtable_class_helper.pxi", line 7089, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'res_id'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/ubuntu/restaverse_spiders/venv/lib/python3.10/site-packages/twisted/internet/defer.py", line 1088, in _runCallbacks
    current.result = callback( # type: ignore[misc]
  File "/home/ubuntu/restaverse_spiders/eggs/zomato_dining/1762927172.egg/zomato_dining/pipelines.py", line 20, in close_spider
    res_id_list = result_df["res_id"].astype(str).unique().tolist()
  File "/home/ubuntu/restaverse_spiders/venv/lib/python3.10/site-packages/pandas/core/frame.py", line 4102, in __getitem__
    indexer = self.columns.get_loc(key)
  File "/home/ubuntu/restaverse_spiders/venv/lib/python3.10/site-packages/pandas/core/indexes/base.py", line 3812, in get_loc
    raise KeyError(key) from err
KeyError: 'res_id'
2025-11-13 04:30:29 [dining_insights_zomato] INFO: Logger Payload: {'run_id': '394650a0-3e8c-435f-a3e6-586bf1486e11', 'timestamp': '2025-11-13T04:30:29Z', 'spider': 'dining_insights_zomato', 'client_id': '123', 'domain': 'www.zomato.com', 'bytes_sent': 0, 'bytes_received': 0, 'duration_seconds': 10.36, 'host': 'ip-172-31-16-168', 'status': 'finished'}
2025-11-13 04:30:29 [urllib3.connectionpool] DEBUG: Starting new HTTP connection (1): watchdog.restaverse.com:9200
2025-11-13 04:30:29 [urllib3.connectionpool] DEBUG: http://watchdog.restaverse.com:9200 "POST /scrapy-2025-11-13/_doc HTTP/11" 201 165
2025-11-13 04:30:29 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): api.scrapeops.io:443
2025-11-13 04:30:29 [urllib3.connectionpool] DEBUG: https://api.scrapeops.io:443 "POST /api/v1/stats/ HTTP/11" 200 109
2025-11-13 04:30:29 [scrapy.statscollectors] INFO: Dumping Scrapy stats: {'downloader/exception_count': 3, 'downloader/exception_type_count/scrapy.core.downloader.handlers.http11.TunnelError': 3, 'downloader/request_bytes': 11280, 'downloader/request_count': 3, 'downloader/request_method_count/GET': 3, 'elapsed_time_seconds': 10.357804, 'finish_reason': 'finished', 'finish_time': datetime.datetime(2025, 11, 13, 4, 30, 29, 233681, tzinfo=datetime.timezone.utc), 'items_per_minute': None, 'log_count/DEBUG': 67, 'log_count/ERROR': 2, 'log_count/INFO': 13, 'log_count/WARNING': 1, 'memusage/max': 116330496, 'memusage/startup': 116330496, 'responses_per_minute': None, 'retry/count': 2, 'retry/max_reached': 1, 'retry/reason_count/scrapy.core.downloader.handlers.http11.TunnelError': 2, 'scheduler/dequeued': 3, 'scheduler/dequeued/memory': 3, 'scheduler/enqueued': 3, 'scheduler/enqueued/memory': 3, 'start_time': datetime.datetime(2025, 11, 13, 4, 30, 18, 875877, tzinfo=datetime.timezone.utc)}
2025-11-13 04:30:29 [scrapy.core.engine] INFO: Spider closed (finished)