java - Nested html not being parsed by Jsoup


I am trying to parse a page with Jsoup, but the HTML is not parsed correctly.

The general structure is:

  <html> <head> ... </head> <frameset ...> <frame ...> #document <html> ... </html> </frame> </frameset> </html>

When I parse the HTML and print it with

  Document doc = Jsoup.parse(html); System.out.println(doc.html());

only the outer HTML is printed (including #document, but not the frame or the inner HTML).
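The behaviour described here can be reproduced without any network access. Jsoup parses only the string it is given: the inner page a browser shows under #document is loaded separately from the frame's src attribute, so it is not part of the parsed tree. The following is a minimal sketch under that assumption; the class name, the inline page, and the file names left.html and right.html are made up for illustration.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class FrameSrcDemo {
    // Hypothetical frameset page; the frame file names are placeholders.
    static final String OUTER =
        "<html><head><title>outer</title></head>"
        + "<frameset cols=\"50%,50%\">"
        + "<frame src=\"left.html\"><frame src=\"right.html\">"
        + "</frameset></html>";

    /** Returns the src of the last frame in the frameset, or "" if there is none. */
    static String lastFrameSrc(String html) {
        Elements frames = Jsoup.parse(html).select("frameset > frame:last-child");
        return frames.isEmpty() ? "" : frames.first().attr("src");
    }

    /** Returns the total number of child nodes held by the parsed frame elements. */
    static int frameChildCount(String html) {
        int total = 0;
        for (Element frame : Jsoup.parse(html).select("frameset > frame")) {
            total += frame.childNodeSize();
        }
        return total;
    }

    public static void main(String[] args) {
        // The frame only carries a reference to its document, not the document itself.
        System.out.println("last frame src: " + lastFrameSrc(OUTER));
        System.out.println("child nodes inside frames: " + frameChildCount(OUTER));
    }
}
```

To get at the inner HTML, the page referenced by src has to be fetched and parsed as a second document.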

Does anyone know how to get the inner HTML with Jsoup, or should I consider using a different library? Thank you.

EDIT: Here is the site I am parsing. I have a subscription; I do not know whether it will let anyone else in.

After authentication, it takes you to:

Edit 2:

    

Then I run:

  Document doc = Jsoup.parse(html);
  Elements elems = doc.select("frameset > frame:last-child");
  // print(elems);
  switch (elems.size()) {
    case 0:
      break;
    case 1:
      // Replace doc with the document that the frame points to
      doc = Jsoup.connect(elems.first().attr("src")).get();
      break;
    default:
      break;
  }
  System.out.println(doc.html());

Printed HTML (doc.html()):

  <html> <head> </head> <body> &iuml;&raquo;&iquest; #document Hello Hello again </body> </html>

Then also <frameset>

Does anyone have any ideas?

A way to parse nested HTML:

  // Uses org.jsoup.Jsoup, org.jsoup.nodes.Document, org.jsoup.nodes.Element, org.jsoup.select.Elements
  // Frameset document
  Document doc = Jsoup.connect("http://database.asahi.com/library/2/login/login.php").get();
  // Login, password, etc. Set the URL of the frame you want to parse ...
  // Note: I think you want to parse the content of the first frame
  Elements elts = doc.select("frameset > frame:first-child");
  switch (elts.size()) {
    case 0:
      // No frame found ...
      break;
    case 1:
      Element frameElt = elts.first();
      Document frameDoc = Jsoup.connect(frameElt.attr("src")).get();
      // Add the frame document's nodes to doc (Element#insertChildren)
      frameElt.insertChildren(0, frameDoc.childNodes());
      break;
    default:
      // Strange result ...
  }
  System.out.println(doc.html());
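The merge step in this answer can be tried offline. The sketch below is a network-free variant, not the answer's exact code: both pages are hypothetical inline strings, and the inner page is parsed with Jsoup.parse where the answer fetches it with Jsoup.connect(...).get(). The class and method names are made up for illustration.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class FrameMergeDemo {
    // Hypothetical outer page with a single frame; inner.html is a placeholder name.
    static final String OUTER =
        "<html><head></head><frameset><frame src=\"inner.html\"></frameset></html>";
    // Hypothetical content of the page that the frame refers to.
    static final String INNER =
        "<html><head></head><body><p>Hello</p><p>Hello again</p></body></html>";

    /** Parses both pages and moves the inner document's nodes under the first frame. */
    static Document merge(String outerHtml, String innerHtml) {
        Document doc = Jsoup.parse(outerHtml);
        Element frameElt = doc.select("frameset > frame:first-child").first();
        if (frameElt != null) {
            // In real use this would be the document fetched from the frame's src.
            Document frameDoc = Jsoup.parse(innerHtml);
            frameElt.insertChildren(0, frameDoc.childNodes());
        }
        return doc;
    }

    public static void main(String[] args) {
        Document merged = merge(OUTER, INNER);
        // The inner paragraphs are now reachable through the outer document.
        System.out.println(merged.select("frame p").size());
        System.out.println(merged.html());
    }
}
```

Against a live site, Jsoup.connect needs an absolute URL, so if the frame's src is relative, frameElt.absUrl("src") is probably the safer choice; it resolves the attribute against the URL the outer document was loaded from.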
