python - scrapy crawler not working from home page -


I have written a crawler with Scrapy (using the scrapy.contrib spiders) that is trying to collect items from a shop. The crawler is:

    from scrapy.contrib.spiders import CrawlSpider, Rule
    from scrapy.contrib.linkextractors.sgml import SgmlLinkExtractor
    from scrapy.selector import Selector
    from .. import items


    class GinakSpider(CrawlSpider):
        name = "ginak"
        start_urls = [
            "http://www.shop.ginakdesigns.com/main.sc"
        ]
        rules = [
            Rule(SgmlLinkExtractor(allow=[r'category\.sc\?categoryId=\d+'])),
            Rule(SgmlLinkExtractor(allow=[r'product\.sc\?productId=\d+&categoryId=\d+']),
                 callback='parse_item'),
        ]

        def parse_item(self, response):
            sel = Selector(response)
            self.log(response.url)
            item = items.GinakItem()
            item['name'] = sel.xpath('//*[@id="wrapper2"]/div/div/div[1]/div/div/div[2]/div/div/div[1]/div[1]/h2/text()').extract()
            item['price'] = sel.xpath('//*[@id="listprice"]/text()').extract()
            # The field name on the next line was lost in the source; 'description' is a guess.
            item['description'] = sel.xpath('//*[@id="wrapper2"]/div/div/div[1]/div/div/div[2]/div/div/div[1]/div[4]/div/p/text()').extract()
            item['category'] = sel.xpath('//*[@id="breadcrumbs"]/a[2]/text()').extract()
            return item

However, it does not follow any links beyond the home page. I have tried all kinds of things and have also double-checked my regular expressions for SgmlLinkExtractor. Is there anything wrong here?

The problem is that a jsessionid is included in the links you want to extract. For example:

    <a href="/category.sc;jsessionid=EA2CAA7A3949F4E462BBF466E03755B7.m1plqscsfapp05?categoryId=16">
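A quick check with Python's re module illustrates why the question's first pattern misses such a link. This is only a sketch: the href is the example above, clean_href is a hypothetical version of the same link without a session id, and re.search is used on the assumption that the link extractor searches the URL for the pattern rather than anchoring it.

```python
import re

# Pattern from the question's first rule: "?" must follow ".sc" directly.
original = re.compile(r'category\.sc\?categoryId=\d+')

# href taken from the example link in the answer.
href = ("/category.sc;jsessionid=EA2CAA7A3949F4E462BBF466E03755B7"
        ".m1plqscsfapp05?categoryId=16")

# Hypothetical: the same link as it would look without a session id.
clean_href = "/category.sc?categoryId=16"

original_matches_href = original.search(href) is not None
original_matches_clean = original.search(clean_href) is not None

print(original_matches_href)   # False: ";jsessionid=..." sits between ".sc" and "?"
print(original_matches_clean)  # True
```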

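Because the session id sits between .sc and the ?, a pattern that expects \? right after .sc never matches. You can fix it by allowing any characters in between with a non-greedy .*? instead:

    rules = [
        Rule(SgmlLinkExtractor(allow=[r'category\.sc.*?categoryId=\d+']),
             callback='parse_item'),
        Rule(SgmlLinkExtractor(allow=[r'product\.sc.*?productId=\d+&categoryId=\d+']),
             callback='parse_item'),
    ]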
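The relaxed patterns can be sanity-checked without running a crawl. The sketch below uses plain re against a few sample hrefs; only the category link with the session id comes from the example in this answer, while the product link and the other URLs are hypothetical and assume the site's product pages follow the same scheme.

```python
import re

# Relaxed patterns: ".*?" tolerates ";jsessionid=..." before the query string.
category = re.compile(r'category\.sc.*?categoryId=\d+')
product = re.compile(r'product\.sc.*?productId=\d+&categoryId=\d+')

samples = {
    # From the example link in the answer.
    "category_session": "/category.sc;jsessionid=EA2CAA7A3949F4E462BBF466E03755B7"
                        ".m1plqscsfapp05?categoryId=16",
    # Hypothetical URLs, assumed for illustration only.
    "category_plain": "/category.sc?categoryId=16",
    "product_session": "/product.sc;jsessionid=ABC123.node1?productId=7&categoryId=16",
    "home": "/main.sc",
}


def matches(pattern, url):
    """Return True if the pattern is found anywhere in the URL."""
    return pattern.search(url) is not None


category_hits = sorted(k for k, u in samples.items() if matches(category, u))
product_hits = sorted(k for k, u in samples.items() if matches(product, u))

print(category_hits)  # ['category_plain', 'category_session']
print(product_hits)   # ['product_session']
```

Neither pattern matches the home page, and the category pattern does not pick up product links, so each rule keeps its own set of URLs.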

Hope that helps.
