jsoup Changelog Archive

Contains change notes for versions 0.1.1 (2010-Jan-31) through 1.17.1 (2023-Nov-27).
More recent changes may be found in CHANGES.md.

Release 1.17.1 [27-Nov-2023]
  * Improvement: in Jsoup.connect(), added support for request-level authentication, supporting authentication to
    proxies and to servers.
    <https://github.com/jhy/jsoup/pull/2046>

  * Improvement: in the Elements list, added direct support for `#set(index, element)`, `#remove(index)`,
    `#remove(object)`, `#clear()`, `#removeAll(collection)`, `#retainAll(collection)`, `#removeIf(filter)`,
    `#replaceAll(operator)`. These methods update the original DOM, as well as the Elements list.
    <https://github.com/jhy/jsoup/pull/2017>

  * Improvement: added the NodeIterator class, to efficiently traverse a node tree using the Iterator interface. And
    added Stream Element#stream() and Node#nodeStream() methods, to enable fluent composable stream pipelines of node
    traversals.
    <https://github.com/jhy/jsoup/pull/2051>

  * Improvement: when changing the OutputSettings syntax to XML, the xhtml EscapeMode is automatically set by default.

  * Improvement: added the `:is(selector list)` pseudo-selector, which finds elements that match any of the selectors in
    the selector list. Useful for making large ORed selectors more readable.

  * Improvement: repackaged the library with native (vs automatic) JPMS module support.
    <https://github.com/jhy/jsoup/pull/2025>

  * Improvement: better fidelity of source positions when tracking is enabled. And implicitly created or closed elements
    are tracked and detectable via Range.isImplicit().
    <https://github.com/jhy/jsoup/pull/2056>

  * Improvement: when source tracking is enabled, the source position for attribute names and values is now available.
    Attribute#sourceRange() provides the ranges.
    <https://github.com/jhy/jsoup/pull/2057>

  * Improvement: when running concurrently under Java 21+ Virtual Threads, virtual threads could be pinned to their
    carrier platform thread when parsing an input stream.
    To improve performance, particularly when parsing fetched URLs, the internal ConstrainableInputStream has been
    replaced by ControllableInputStream, which avoids the locking which caused that pinning.
    <https://github.com/jhy/jsoup/issues/2054>

  * Improvement: in Jsoup.Connect, allow any XML mimetype as a supported mimetype. Was previously limited to
    `{application|text}/xml`. This enables, e.g., fetching SVGs with an image/svg+xml mimetype, without having to
    disable mimetype validation.
    <https://github.com/jhy/jsoup/issues/2059>

  * Bugfix: when outputting with XML syntax, HTML elements that were parsed as data nodes (<script> and <style>) should
    be emitted as CDATA nodes, so that they can be parsed correctly by an XML parser.
    <https://github.com/jhy/jsoup/pull/1720>

  * Bugfix: the Immediate Parent selector `>` could match elements above the root context element, causing incorrect
    elements to be returned when used on elements other than the root document.
    <https://github.com/jhy/jsoup/issues/2018>

  * Bugfix: in a sub-query such as `p:has(> span, > i)`, combinators following the `,` Or combinator would be
    incorrectly skipped, such that the sub-query was parsed as `i` instead of `> i`.
    <https://github.com/jhy/jsoup/issues/1707>

  * Bugfix: in W3CDom, if the jsoup input document contained an empty doctype, the conversion would fail with a
    DOMException. Now, said doctype is discarded, and the conversion continues.

  * Bugfix: when cleaning a document containing SVG elements (or other foreign elements that have preserved case names),
    the cleaned output would be incorrectly nested if the safelist had a different case than the input document.
    <https://github.com/jhy/jsoup/issues/2049>

  * Bugfix: when cleaning a document, the output style of unknown self-closing tags from the input was not preserved in
    the output. (So a <foo /> in the input, if safe-listed, would be output as <foo></foo>.)
    <https://github.com/jhy/jsoup/issues/2049>

  * Build Improvement: added a local test proxy implementation, for proxy integration tests.
    <https://github.com/jhy/jsoup/pull/2029>

  * Build Improvement: added tests for HTTPS request support, using a local self-signed cert. Includes proxy tests.
    <https://github.com/jhy/jsoup/pull/2032>

  * Change: the InputStream returned in Connection.Response.bodyStream() is no longer a ConstrainableInputStream, and
    so is not subject to settings such as timeout or maximum size. It is now a plain BufferedInputStream around the
    response stream. Whilst this behaviour was not documented, you may have been inadvertently relying on those
    constraints. The constraints are still applied to other methods such as .parse() and .bufferUp(). So if you do want
    a constrained BufferedInputStream, you may do Connection.Response.bufferUp().bodyStream().
    <https://github.com/jhy/jsoup/issues/2054>

Release 1.16.2 [20-Oct-2023]
  * Improvement: optimized the performance of complex CSS selectors, by adding a cost-based query planner. Evaluators
    are sorted by their relative execution cost, and executed in order of lower to higher cost. This speeds the
    matching process by ensuring that simpler evaluations (such as a tag name match) are conducted prior to more
    complex evaluations (such as an attribute regex, or a deep child scan with a :has).
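
As a rough illustration of the cost-based ordering described in the entry above, the sketch below sorts a set of
predicates by an assumed relative cost and evaluates them cheapest-first, stopping at the first failure. The names
used here (CostOrderedMatcher, Costed, plan, matchesAll) and the cost numbers are hypothetical; this is not jsoup's
internal API, and the actual planner may work differently.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.function.Predicate;

/** Illustrative sketch only; these types are hypothetical and are not jsoup's internal classes. */
final class CostOrderedMatcher {

    /** A predicate paired with an assumed relative execution cost. */
    static final class Costed<T> {
        final String name;
        final int cost;
        final Predicate<T> test;

        Costed(String name, int cost, Predicate<T> test) {
            this.name = name;
            this.cost = cost;
            this.test = test;
        }
    }

    /** Returns a copy of the evaluators ordered from lowest to highest cost. */
    static <T> List<Costed<T>> plan(List<Costed<T>> evaluators) {
        List<Costed<T>> sorted = new ArrayList<>(evaluators);
        sorted.sort(Comparator.comparingInt(e -> e.cost));
        return sorted;
    }

    /** ANDs the planned evaluators, stopping at the first failure; records the names of those that ran. */
    static <T> boolean matchesAll(List<Costed<T>> planned, T candidate, List<String> ran) {
        for (Costed<T> e : planned) {
            ran.add(e.name);
            if (!e.test.test(candidate)) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        List<Costed<String>> evaluators = new ArrayList<>();
        // Declared expensive-first on purpose; plan() reorders them.
        evaluators.add(new Costed<String>("regex", 8, s -> s.matches(".*\\d+.*")));
        evaluators.add(new Costed<String>("prefix", 1, s -> s.startsWith("div")));

        List<String> ran = new ArrayList<>();
        boolean matched = matchesAll(plan(evaluators), "span#a1", ran);
        System.out.println(matched + " " + ran); // prints "false [prefix]"
    }
}
```

In this run only the cheap prefix check executes; the regex is never evaluated because the candidate has already
been rejected.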

  * Improvement: added support for <svg> and <math> tags (and their children). This includes tag namespaces and case
    preservation on applicable tags and attributes.
    <https://github.com/jhy/jsoup/pull/2008>

  * Improvement: when converting jsoup Documents to W3C Documents in W3CDom, HTML documents will be placed in the
    `http://www.w3.org/1999/xhtml` namespace by default, per the HTML5 spec. This can be controlled by setting
    `W3CDom#namespaceAware(false)`.
    <https://github.com/jhy/jsoup/pull/1848>

  * Improvement: speed optimized the Structural Evaluators by memoizing previous evaluations. Particularly the `~`
    (any preceding sibling) and `:nth-of-type` selectors are improved.
    <https://github.com/jhy/jsoup/issues/1956>

  * Improvement: tweaked the performance of the Element nextElementSibling, previousElementSibling, firstElementSibling,
    lastElementSibling, firstElementChild, and lastElementChild. They now filter/skip in place in the child-node list,
    vs having to allocate and scan a complete Element filtered list.
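
The in-place filter/skip idea in the entry above can be sketched roughly as follows: rather than first building a
filtered list of elements, the lookup walks the existing child-node list and returns the first element it meets.
The types here (Node, Text, Elem) and the class SiblingScan are hypothetical stand-ins written for this sketch, not
jsoup's classes, and the real implementation may differ.

```java
import java.util.Arrays;
import java.util.List;

/** Illustrative sketch only; Node, Text and Elem are hypothetical stand-ins, not jsoup's classes. */
final class SiblingScan {
    static class Node {}
    static final class Text extends Node {}
    static final class Elem extends Node {
        final String tag;
        Elem(String tag) { this.tag = tag; }
    }

    /** Scans forward in the shared child-node list from the given index; allocates no filtered list. */
    static Elem nextElementSibling(List<Node> siblings, int index) {
        for (int i = index + 1; i < siblings.size(); i++) {
            Node n = siblings.get(i);
            if (n instanceof Elem) return (Elem) n;
        }
        return null; // no later element sibling
    }

    /** Returns the first element child by skipping over non-element nodes. */
    static Elem firstElementChild(List<Node> children) {
        return nextElementSibling(children, -1);
    }

    public static void main(String[] args) {
        List<Node> children = Arrays.asList(new Text(), new Elem("p"), new Text(), new Elem("span"));
        System.out.println(firstElementChild(children).tag);     // p
        System.out.println(nextElementSibling(children, 1).tag); // span
        System.out.println(nextElementSibling(children, 3));     // null
    }
}
```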

  * Improvement: optimized internal methods that previously called Element.children() to use filter/skip child-node list
    accessors instead, reducing new Element List allocations.

  * Improvement: tweaked the performance of parsing :pseudo selectors.

  * Improvement: when using the `:empty` pseudo-selector, blank textnodes are now considered empty. Previously,
    an element containing any whitespace was not considered empty.
    <https://github.com/jhy/jsoup/issues/1976>

  * Improvement: in forms, <input type="image"> should be excluded from formData() (and hence from form submissions).
    <https://github.com/jhy/jsoup/pull/2010>

  * Improvement: in Safelist, made isSafeTag and isSafeAttribute public methods, for extensibility.
    <https://github.com/jhy/jsoup/issues/1780>

  * Bugfix: `form` elements and empty elements (such as `img`) did not have their attributes de-duplicated.
    <https://github.com/jhy/jsoup/pull/1950>

  * Bugfix: if Document.OutputSettings was cloned from a clone, an NPE would be thrown when used.
    <https://github.com/jhy/jsoup/pull/1964>

  * Bugfix: in Jsoup.connect(url), URL paths containing a %2B were incorrectly recoded to a '+', or a '+' was recoded
    to a ' '. Fixed by reverting to the previous behavior of not encoding supplied paths, other than normalizing to
    ASCII.
    <https://github.com/jhy/jsoup/issues/1952>

  * Bugfix: in Jsoup.connect(url), strings containing supplemental characters (e.g. emoji) were not URL escaped
    correctly.

  * Bugfix: in Jsoup.connect(url), the ConstrainableInputStream would clear Thread interrupts when reading the body.
    This precluded callers from spawning a thread, running a number of requests for a length of time, then joining that
    thread after interrupting it.
    <https://github.com/jhy/jsoup/issues/1991>

  * Bugfix: when tracking HTML source positions, the closing tags for H1...H6 elements were not tracked correctly.
    <https://github.com/jhy/jsoup/issues/1987>

  * Bugfix: in Jsoup.connect(), a DELETE method request did not support a request body.
    <https://github.com/jhy/jsoup/issues/1972>

  * Bugfix: when calling Element.cssSelector() on an extremely deeply nested element, a StackOverflowError could occur.
    Further, a StackOverflowError may occur when running the query.
    <https://github.com/jhy/jsoup/issues/2001>

  * Bugfix: appending a node back to its original Element after empty() would throw an Index out of bounds exception.
    Also, now the child nodes that were removed have their parent node cleared, fully detaching them from the original
    parent.
    <https://github.com/jhy/jsoup/issues/2013>

  * Bugfix: in Jsoup.Connection when adding headers, the value may have been assumed to be an incorrectly decoded
    ISO_8859_1 string, and re-encoded as UTF-8. The value is now left as-is.

  * Change: removed previously deprecated methods Document#normalise, Element#forEach(org.jsoup.helper.Consumer<>),
    Node#forEach(org.jsoup.helper.Consumer<>), and the org.jsoup.helper.Consumer interface; the latter being a
    previously required compatibility shim prior to Android's de-sugaring support.

  * Change: the previous compatibility shim org.jsoup.UncheckedIOException is deprecated in favor of the now supported
    java.io.UncheckedIOException. If you are catching the former, modify your code to catch the latter instead.
    <https://github.com/jhy/jsoup/pull/1989>

  * Change: blocked noscript tags from being added to Safelists, due to incompatibilities between parsers with and
    without script-mode enabled.

Release 1.16.1 [29-Apr-2023]
  * Improvement: in Jsoup.connect(url), natively support URLs with Unicode characters in the path or query string,
    without having to be escaped by the caller.
    <https://github.com/jhy/jsoup/issues/1914>

  * Improvement: Calling Node.remove() on a node with no parent is now a no-op, vs a validation error.
    <https://github.com/jhy/jsoup/issues/1898>

  * Bugfix: aligned the HTML Tree Builder processing steps for AfterBody and AfterAfterBody to the updated WHATWG
    standard, to not pop the stack to close <body> or <html> elements. This prevents an errant </html> closing preceding
    structure. Also added appropriate error message outputs in this case.
    <https://github.com/jhy/jsoup/issues/1851>

  * Bugfix: Corrected support for ruby elements (<ruby>, <rp>, <rt>, and <rtc>) to current spec.
    <https://github.com/jhy/jsoup/issues/1294>

  * Bugfix: When using Node.before(node) or Node.after(node), if the incoming node was a sibling of the context node,
    the incoming node may be inserted into the wrong relative location.
    <https://github.com/jhy/jsoup/issues/1898>

  * Bugfix: In Jsoup.connect(url), if the input URL had components that were already % escaped, they would be escaped
    again, causing errors when fetched.
    <https://github.com/jhy/jsoup/issues/1902>

  * Bugfix: when tracking input source positions, text in tables that was fostered had invalid positions.
    <https://github.com/jhy/jsoup/issues/1927>

  * Bugfix: If the Document.OutputSettings class was initialized, and then Entities.escape(String) called, an NPE may be
    thrown due to a class loading circular dependency.
    <https://github.com/jhy/jsoup/issues/1910>

  * Bugfix: when pretty-printing, the first inline Element or Comment in a block would not be wrap-indented if it were
    preceded by a blank text node.
    <https://github.com/jhy/jsoup/issues/1906>

  * Bugfix: when pretty-printing a <pre> containing block tags, those tags were incorrectly indented.
    <https://github.com/jhy/jsoup/issues/1891>

  * Bugfix: when pretty-printing nested inlineable blocks (such as a <p> in a <td>), the inner element should be
    indented.
    <https://github.com/jhy/jsoup/issues/1926>

  * Bugfix: <br> tags should be wrap-indented when in block tags (and not when in inline tags).
    <https://github.com/jhy/jsoup/issues/1911>

  * Bugfix: the contents of a sufficiently large <textarea> with un-escaped HTML closing tags may be incorrectly parsed
    to an empty node.
    <https://github.com/jhy/jsoup/issues/1929>

Release 1.15.4 [18-Feb-2023]
  * Improvement: added the ability to escape CSS selectors (tags, IDs, classes) to match elements that don't follow
    regular CSS syntax. For example, to match by classname <p class="one.two">, use document.select("p.one\\.two");
    <https://github.com/jhy/jsoup/issues/838>

  * Improvement: when pretty-printing, wrap text that follows a <br> tag.
    <https://github.com/jhy/jsoup/issues/1858>

  * Improvement: when pretty-printing, normalize newlines that follow self-closing tags in custom tags.
    <https://github.com/jhy/jsoup/issues/1852>

  * Improvement: when pretty-printing, collapse non-significant whitespace between a block and an inline tag.
    <https://github.com/jhy/jsoup/issues/1802>

  * Improvement: in Element#forEach and Node#forEachNode, use java.util.function.Consumer instead of the previous
    Android compatibility shim org.jsoup.helper.Consumer. Subsequently, the latter has been deprecated.
    <https://github.com/jhy/jsoup/pull/1870>

  * Improvement: added a new method Document#forms(), to conveniently retrieve a List<FormElement> containing the <form>
    elements in a document.

  * Improvement: added a new method Document#expectForm(query), to find the first matching FormElement, or blow up
    trying.

  * Bugfix: URLs containing characters such as [ and ] were not escaped correctly, and would throw a
    MalformedURLException when fetched.
    <https://github.com/jhy/jsoup/issues/1873>

  * Bugfix: Element.cssSelector would create invalid selectors for elements where the tag name, ID, or classnames needed
    to be escaped (e.g. if a class name contained a ':' or '.').
    <https://github.com/jhy/jsoup/issues/1742>

  * Bugfix: element.text() should have a space between a block and an inline element.
    <https://github.com/jhy/jsoup/issues/1877>

  * Bugfix: if a Node or an Element was replaced with itself, that node would incorrectly be orphaned.
    <https://github.com/jhy/jsoup/issues/1843>

  * Bugfix: form data on a previous request was copied to a new request in newRequest(), resulting in an accumulation of
    form data when executing multi-step form submissions, or data sent to later requests incorrectly. Now, newRequest()
    only copies session related settings (cookies, proxy settings, user-agent, etc) but not the request data nor the
    body.
    <https://github.com/jhy/jsoup/issues/1778>

  * Bugfix: fixed an issue in Safelist.removeAttributes which could throw a ConcurrentModificationException when using
    the ":all" pseudo-attribute.

  * Bugfix: given extremely deeply nested HTML, a number of methods in Element could throw a StackOverflowError due
    to excessive recursion. Namely: #data(), #hasText(), #parents(), and #wrap(html).
    <https://github.com/jhy/jsoup/issues/1864>

  * Change: deprecated the unused Document#normalise() method. Normalization occurs during the HTML tree construction,
    and no longer as a distinct phase.
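
For the StackOverflowError entry in 1.15.4 above, one common way to avoid excessive recursion on deeply nested trees
is to walk the tree with an explicit stack, so that depth is limited by heap size instead of the thread's call
stack. The sketch below shows that general technique on a hypothetical TreeNode type; it is an assumption for
illustration and does not claim to reproduce how jsoup's Element methods were actually changed.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Illustrative sketch only; TreeNode is a hypothetical stand-in, not a jsoup class. */
final class IterativeWalk {
    static final class TreeNode {
        final String text; // empty for nodes that carry no text
        final List<TreeNode> children = new ArrayList<>();
        TreeNode(String text) { this.text = text; }
    }

    /** Depth-first search with an explicit stack; returns true if any node has non-blank text. */
    static boolean hasText(TreeNode root) {
        Deque<TreeNode> stack = new ArrayDeque<>();
        stack.push(root);
        while (!stack.isEmpty()) {
            TreeNode node = stack.pop();
            if (!node.text.trim().isEmpty()) return true;
            for (TreeNode child : node.children) stack.push(child);
        }
        return false;
    }

    /** Builds a single chain of the given depth below the root, with text only at the deepest node. */
    static TreeNode deepChain(int depth, String leafText) {
        TreeNode root = new TreeNode("");
        TreeNode current = root;
        for (int i = 0; i < depth; i++) {
            TreeNode next = new TreeNode(i == depth - 1 ? leafText : "");
            current.children.add(next);
            current = next;
        }
        return root;
    }

    public static void main(String[] args) {
        // A chain this deep would typically overflow the default stack with a naive recursive walk.
        System.out.println(hasText(deepChain(200_000, "hello"))); // true
        System.out.println(hasText(deepChain(200_000, "  ")));    // false
    }
}
```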

Release 1.15.3 [2022-Aug-24]
  * Security: fixed an issue where the jsoup cleaner may incorrectly sanitize crafted XSS attempts if
    Safelist.preserveRelativeLinks is enabled.
    <https://github.com/jhy/jsoup/security/advisories/GHSA-gp7f-rwcx-9369>

  * Improvement: the Cleaner will preserve the source position of cleaned elements, if source tracking is enabled in the
    original parse.

  * Improvement: the error messages output from Validate are more descriptive. Exceptions are now ValidationExceptions
    (extending IllegalArgumentException). Stack traces do not include the Validate class, to make it simpler to see
    where the exception originated. Common validation errors including malformed URLs and empty selector results have
    more explicit error messages.

  * Bugfix: the DataUtil would incorrectly read from InputStreams that emitted reads less than the requested size. This
    led to incorrect results when parsing from chunked server responses, for example.
    <https://github.com/jhy/jsoup/issues/1807>

  * Build Improvement: added implementation version and related fields to the jar manifest.
    <https://github.com/jhy/jsoup/issues/1809>

*** Release 1.15.2 [2022-Jul-04]
  * Improvement: added the ability to track the position (line, column, index) in the original input source from where
    a given node was parsed. Accessible via Node.sourceRange() and Element.endSourceRange().
    <https://github.com/jhy/jsoup/pull/1790>

  * Improvement: added Element.firstElementChild(), Element.lastElementChild(), Node.firstChild(), Node.lastChild(),
    as convenient accessors to those child nodes and elements.

  * Improvement: added Element.expectFirst(cssQuery), which is just like Element.selectFirst(), but instead of returning
    a null if there is no match, will throw an IllegalArgumentException. This is useful if you want to simply abort
    processing if an expected match is not found.

  * Improvement: when pretty-printing HTML, doctypes are emitted on a newline if there is a preceding comment.
    <https://github.com/jhy/jsoup/pull/1664>

  * Improvement: when pretty-printing, trim the leading and trailing spaces of textnodes in block tags when possible,
    so that they are indented correctly.
    <https://github.com/jhy/jsoup/issues/1798>

  * Improvement: in Element#selectXpath(), disable namespace awareness. This makes it possible to always select elements
    by their simple local name, regardless of whether an xmlns attribute was set.
    <https://github.com/jhy/jsoup/issues/1801>

  * Bugfix: when using the readToByteBuffer method, such as in Connection.Response.body(), if the document has not
    already been parsed and must be read fully, and there is any maximum buffer size being applied, only the default
    internal buffer size is read.
    <https://github.com/jhy/jsoup/issues/1774>

  * Bugfix: when serializing HTML, newlines in elements descending from a pre tag were incorrectly skipped. That caused
    what should have been preformatted output to instead be a run of text.
    <https://github.com/jhy/jsoup/issues/1776>

  * Bugfix: when pretty-print serializing HTML, newlines separating phrasing content (e.g. a <span> tag within a <p> tag)
    would be incorrectly skipped, instead of normalized to a space.
   Additionally, improved space normalization between other end of line occurrences, and whitespace handling after
   a closing </body> tag.
   <https://github.com/jhy/jsoup/issues/1787>

*** Release 1.15.1 [2022-May-15]
 * Change: removed previously deprecated methods and classes (including org.jsoup.safety.Whitelist; use
   org.jsoup.safety.Safelist instead).

 * Improvement: when converting jsoup Documents to W3C Documents in W3CDom, preserve HTML valid attribute names if the
   input document is using the HTML syntax. (Previously, would always coerce using the more restrictive XML syntax.)
   <https://github.com/jhy/jsoup/pull/1648>

 * Improvement: added the :containsWholeText(text) selector, to match against non-normalized Element text. That can be
   useful when elements can only be distinguished by e.g. specific case, or leading whitespace, etc.
   <https://github.com/jhy/jsoup/issues/1636>

 * Improvement: added Element#wholeOwnText() to retrieve the original (non-normalized) ownText of an Element. Also
   added the :containsWholeOwnText(text) selector, to match against that. BR elements are now treated as newlines
   in the wholeText methods.
   <https://github.com/jhy/jsoup/issues/1636>

 * Improvement: added the :matchesWholeText(regex) and :matchesWholeOwnText(regex) selectors, to match against whole
   (non-normalized, case sensitive) element text and own text, respectively.
   <https://github.com/jhy/jsoup/issues/1636>

 * Improvement: when evaluating an XPath query against a context element, the complete document is now visible to the
   query, vs only the context element's sub-tree. This enables support for queries outside (parent or sibling) the
   element, e.g. ancestor-or-self::*.
   <https://github.com/jhy/jsoup/issues/1652>

 * Improvement: allow a maxPaddingWidth on the indent level in OutputSettings when pretty printing. This defaults to
   30 to limit the indent level for very deeply nested elements, and may be disabled by setting to -1.
   <https://github.com/jhy/jsoup/pull/1655>

 * Improvement: when cloning a Node or an Element, the clone gets a cloned OwnerDocument containing only that clone, so
   as to preserve applicable settings, such as the Pretty Print settings.
   <https://github.com/jhy/jsoup/issues/763>

 * Improvement: added a convenience method Jsoup.parse(File).
   <https://github.com/jhy/jsoup/issues/1693>

 * Improvement: in the NodeTraversor, added default implementations for NodeVisitor.tail() and NodeFilter.tail(), so
   that code using only head() methods can be written as lambdas.

 * Improvement: in NodeTraversor, added support for removing nodes via Node.remove() during NodeVisitor.head().
   <https://github.com/jhy/jsoup/issues/1699>

 * Improvement: added Node.forEachNode(Consumer<Node>) and Element.forEach(Consumer<Element>) methods, to efficiently
   traverse the DOM with a functional interface.
   <https://github.com/jhy/jsoup/issues/1700>

 * Bugfix: boolean attribute names should be case-insensitive, but were not when the parser was configured to preserve
   case.
   <https://github.com/jhy/jsoup/issues/1656>

 * Bugfix: when reading from SequenceInputStreams across the buffer, the input stream was closed too early, resulting
   in missed content.
   <https://github.com/jhy/jsoup/pull/1671>

 * Bugfix: a comment with all dashes (<!----->) should not emit a parse error.
   <https://github.com/jhy/jsoup/issues/1667>

 * Bugfix: when throwing a SelectorParseException for an invalid selector, don't try to String.format the input, as
   that could throw an IllegalFormatException.
   <https://github.com/jhy/jsoup/issues/1691>

 * Bugfix: when serializing HTML with Pretty Print enabled, extraneous whitespace may be added on closing tags, or
   extra newlines may be added at the end of script blocks.
   <https://github.com/jhy/jsoup/issues/1688>
   <https://github.com/jhy/jsoup/issues/1689>

 * Bugfix: when copy-creating a Safelist from another, perform a deep-copy of the original's settings, so that changes
   to the original after creation do not affect the copy.
   <https://github.com/jhy/jsoup/pull/1763>

 * Bugfix [Fuzz]: speed improvement when parsing constructed HTML containing very deeply incorrectly stacked formatting
   elements with many attributes.
   <https://github.com/jhy/jsoup/issues/1695>

 * Bugfix [Fuzz]: during parsing, a StackOverflowError was possible given crafted HTML with hundreds of nested
   table elements followed by invalid formatting elements.
   <https://github.com/jhy/jsoup/issues/1697>

*** Release 1.14.3 [2021-Sep-30]
 * Improvement: added native XPath support in Element#selectXpath(String).
   <https://github.com/jhy/jsoup/pull/1629>

 * Improvement: added full support for the <template> tag, per the HTML5 parser spec.
   <https://github.com/jhy/jsoup/issues/1634>

 * Improvement: added support in CharacterReader to track newlines, so that parse errors can be reported more
   intuitively.
   <https://github.com/jhy/jsoup/pull/1624>

 * Improvement: tracked parse errors now have more details, including the erroneous token, to help clarify the errors.

 * Improvement: speed and memory optimizations for the :has(subquery) selector.

 * Improvement: the :contains(text) and :containsOwn(text) selectors are now whitespace normalized, aligning to the
   document text that they are matching against.
   <https://github.com/jhy/jsoup/issues/876>

 * Improvement: in Element, speed optimized adopting all of an element's child nodes into a currently empty element.
   Improves the HTML adoption agency algorithm when adopting elements with many children.
   <https://github.com/jhy/jsoup/issues/1638>

 * Improvement: increased the parse speed when in RCData (e.g. <title>) and unescaped <tag> tokens are found, by
   memoizing the </title> scan and reducing GC.
   <https://github.com/jhy/jsoup/issues/1644>

 * Improvement: when parsing custom tags (in HTML or XML), added a flyweight cache on Tag.valueOf(name) to reduce
   memory overhead when many tags are repeated. Also tuned other areas of the parser when many very deeply stacked
   custom elements were present.
   <https://github.com/jhy/jsoup/issues/1646>

 * Bugfix: when tracking errors or checking for validity in the Cleaner, errors were incorrectly raised for missing
   optional closing tags.

 * Bugfix: the OSGi bundle meta-data incorrectly set a version on the import of javax.annotation (used as a build-time
   dependency for nullability assertions).
   <https://github.com/jhy/jsoup/issues/1616>

 * Bugfix: the Attributes::equals() method was sensitive to the order of its contents, but it should not be.
   <https://github.com/jhy/jsoup/issues/1492>

 * Bugfix: when the HTML parser was configured to preserve case, Element text methods would miss adding whitespace for
   "BR" tags.

 * Bugfix: attribute names are now normalized & validated correctly for the specific output syntax (HTML or XML).
   Previously, syntactically invalid attribute names could be output by the html() methods. Such attributes are still
   available in the DOM, and will be normalized if possible on output.
   <https://github.com/jhy/jsoup/issues/1474>

 * Bugfix [Fuzz]: fixed an IOOB when an empty select tag was followed by a body tag that needed reparenting.
   <https://github.com/jhy/jsoup/issues/1639>

 * Build Improvement: fixed nullability annotations for Node.equals(other) and other equals methods.
   <https://github.com/jhy/jsoup/issues/1628>

 * Build Improvement: added JDK 17 to the CI builds.
   <https://github.com/jhy/jsoup/pull/1641>

*** Release 1.14.2 [2021-Aug-15]
 * Improvement: support Pattern.quote \Q and \E escapes in the selector regex matchers.
   <https://github.com/jhy/jsoup/pull/1536>

 * Improvement: Element.absUrl() now supports tel: URLs, and other URLs that are already absolute but that Java does
   not have input stream handlers for.
   <https://github.com/jhy/jsoup/issues/1610>

 * Bugfix: when serializing output, escape characters that are in the < 0x20 range. This improves XML output
   compatibility, and makes HTML output with these characters easier to read (as they're otherwise invisible).
   <https://github.com/jhy/jsoup/issues/1556>

 * Bugfix: the *|el wildcard namespace selector now also matches elements with no namespace.
   <https://github.com/jhy/jsoup/issues/1565>

 * Bugfix: corrected a potential case of the parser input stream not being closed immediately on a read exception.

 * Bugfix: when making an HTTP POST, if the request write fails, make sure the connection is immediately cleaned up.

 * Bugfix: in the XML parser, XML processing instructions without attributes would be serialized as if they did.
   <https://github.com/jhy/jsoup/issues/770>

 * Bugfix: updated the HtmlTreeParser resetInsertionMode to the current spec for supported elements.
   <https://github.com/jhy/jsoup/issues/1491>

 * Bugfix: fixed an NPE when parsing fragment HTML into a standalone table element.
   <https://github.com/jhy/jsoup/issues/1603>

 * Bugfix: fixed an NPE when parsing fragment heading HTML into a standalone p element.
   <https://github.com/jhy/jsoup/issues/1601>

 * Bugfix: fixed an IOOB when parsing a formatting fragment into a standalone p element.
   <https://github.com/jhy/jsoup/issues/1602>

 * Bugfix: tag names must start with an ascii-alpha character.
   <https://github.com/jhy/jsoup/issues/1006>

 * Bugfix [Fuzz]: fixed a slow parse when a tag or an attribute name has thousands of null characters in it.
   <https://github.com/jhy/jsoup/issues/1580>

 * Bugfix [Fuzz]: the adoption agency algorithm can have an incorrect bookmark position.
   <https://github.com/jhy/jsoup/issues/1576>

 * Bugfix [Fuzz]: malformed HTML could result in null elements on stack.
   <https://github.com/jhy/jsoup/issues/1579>

 * Bugfix [Fuzz]: malformed deeply nested table elements could create a stack overflow.
   <https://github.com/jhy/jsoup/issues/1577>

 * Bugfix [Fuzz]: Speed optimized malformed HTML creating elements with thousands of attributes - limit the attribute
   count per element when parsing to 512 (in real-world HTML, P99 is ~ 8).
   <https://github.com/jhy/jsoup/issues/1578>

 * Bugfix [Fuzz]: Speed improvement for the foster formatting elements algo, by limiting how far up a crafted stack
   to scan.
   <https://github.com/jhy/jsoup/issues/1593>

 * Bugfix [Fuzz]: Speed improvement when parsing crafted HTML when transferring form attributes.
   <https://github.com/jhy/jsoup/issues/1595>

 * Bugfix [Fuzz]: Speed improvement when the stack was thousands of items deep, and non-matching close tags were sent.
   <https://github.com/jhy/jsoup/issues/1596>

 * Bugfix [Fuzz]: Speed improvement when an attribute name is 600K of quote characters or otherwise needs accumulation
   vs being able to read in one hit.
   <https://github.com/jhy/jsoup/issues/1605>

 * Bugfix [Fuzz]: Speed improvement when closing missing empty tags (in XML comment processed as HTML) when thousands
   deep in stack.
   <https://github.com/jhy/jsoup/issues/1606>

 * Bugfix [Fuzz]: Fix a potential stack-overflow in the parser given crafted HTML, when the parser looped in the
   InSelectInTable state.

 * Bugfix [Fuzz]: Fix an IOOB when the HTML root was cleared from the stack and then attributes were merged onto it.
   <https://github.com/jhy/jsoup/issues/1611>

 * Bugfix [Fuzz]: Improved the speed of parsing when crafted HTML contains hundreds of active formatting elements
   that were copied for all new elements (similar to an amplification attack). The number of considered active
   formatting elements that will be cloned when mis-nested is now capped to 12.
   <https://github.com/jhy/jsoup/issues/1613>

*** Release 1.14.1 [2021-Jul-10]
 * Change: updated the minimum supported Java version from Java 7 to Java 8.

 * Change: updated the minimum Android API level from 8 to 10.

 * Change: although Node#childNodes() returns an UnmodifiableList as a view into its children, it was still
   directly backed by the internal child list. That made some uses, such as looping and moving those children to
   another element, throw a ConcurrentModificationException. Now this method returns its own list so that they are
   separated and changes to the parent's contents will not impact the children view. This aligns with similar methods
   such as Element#children(). If you have code that iterates this list and makes parenting changes to its contents,
   you may need to make a code update.
   <https://github.com/jhy/jsoup/issues/1431>

 * Change: the org.jsoup.Connection interface has been modified to introduce new methods for sessions and the cookie
   store. If you have a custom implementation of this interface, you will need to add implementations of these methods.

 * Improvement: added HTTP request session management support with Jsoup.newSession(). This extends the Connection
   implementation to support (optional) sessions, which allow request defaults (timeout, proxy, etc) to be set once and
   then applied to all requests within that session.

   Cookies are re-implemented to correctly support path and domain filtering when used within a session. A default
   in-memory cookie store is used for the session, or a custom implementation (perhaps disk-persistent, or pre-set)
   can be used instead.

   Forms submitted using the FormElement#submit() use the same session that was used to fetch the document and so pass
   cookies and other defaults appropriately.

   The session is multi-thread safe and can execute multiple requests concurrently.
   If the user accidentally tries to
   execute the same request object across multiple threads (vs calling Connection#newRequest()),
   that is detected cleanly and a clear exception is thrown (vs weird blowups in input stream reading, or forcing
   everything through a synchronized bottleneck).
   <https://github.com/jhy/jsoup/pull/1476>

 * Improvement: renamed the Whitelist class to Safelist, with the goal of more inclusive language. A shim is provided
   for backwards compatibility (source and binary). This shim is marked as deprecated and will be removed in the
   jsoup 1.15.1 release.
   <https://github.com/jhy/jsoup/pull/1464>

 * Improvement: added support for Internationalized Domain Names (IDNs) in Jsoup.connect().
   <https://github.com/jhy/jsoup/issues/1300>

 * Improvement: added support for loading and parsing gzipped HTML files in Jsoup.parse(File in, charset, baseUri).

 * Improvement: reduced thread contention in HttpConnection and Document.
   <https://github.com/jhy/jsoup/pull/1455>

 * Improvement: better parsing performance when under high thread concurrency.
   <https://github.com/jhy/jsoup/pull/1402>

 * Improvement: added Element#id(String) ID attribute setter.

 * Improvement: in Document, #body() and #head() accessors will now automatically create those elements, if they were
   missing (e.g. if the Document was not parsed from HTML). Additionally, the #body() method returns the frameset
   element (instead of null) for frameset documents.

 * Improvement: when cleaning a document, the output settings of the original document are cloned into the cleaned
   document.
   <https://github.com/jhy/jsoup/issues/1417>

 * Improvement: when parsing XML, disable pretty-printing by default.
   <https://github.com/jhy/jsoup/issues/1168>

 * Improvement: much better performance in Node#clone() for large and deeply nested documents. Complexity was O(n^2) or
   worse, now O(n).

 * Improvement: during traversal using the NodeTraversor, nodes may now be replaced with Node#replaceWith(Node).
   <https://github.com/jhy/jsoup/issues/1289>

 * Improvement: added Element#appendChildren and Element#prependChildren, as convenience methods in addition to
   Element#insertChildren(index, children), for bulk moving nodes.

 * Improvement: clean up relative URLs with too many .. segments better.
   <https://github.com/jhy/jsoup/pull/1482>

 * Build Improvement: integrated jsoup into the OSS Fuzz project, which semi-randomly generates millions of different
   HTML and XML input files, searching for areas to improve in the parser for increased robustness and throughput.
   <https://github.com/jhy/jsoup/issues/1502>

 * Build Improvement: integrated with GitHub's CodeQL static code analyzer.
   <https://github.com/jhy/jsoup/pull/1494>

 * Build Improvement: moved to GitHub Workflows for build verification.

 * Build Improvement: updated Jetty (used for integration tests; not bundled) to 9.4.42.

 * Build Improvement: added nullability annotations and initial settings.
   <https://github.com/jhy/jsoup/pull/1467>

 * Bugfix: corrected the adoption agency algorithm, to handle cases where e.g. an <a> tag incorrectly nests further <a>
   tags.
   <https://github.com/jhy/jsoup/pull/1517> <https://github.com/jhy/jsoup/issues/845>

 * Bugfix: when parsing HTML, could throw NPEs on some tags (isindex or table>input).
   <https://github.com/jhy/jsoup/issues/1404>

 * Bugfix: in HttpConnection.Request, headers beginning with "sec-" (e.g. Sec-Fetch-Mode) were silently discarded by
   the underlying Java HttpURLConnection. These are now settable correctly.
   <https://github.com/jhy/jsoup/issues/1461>

 * Bugfix: when adding child Nodes to a Node, could incorrectly reparent all nodes if the first parent had the same
   length of children as the incoming node list.

 * Bugfix: when wrapping an orphaned element, would throw an NPE.

 * Bugfix: when wrapping an element with HTML that included multiple sibling elements, those siblings were incorrectly
   added as children of the wrapper instead of siblings.

 * Bugfix: when setting the content of a script or style tag via the Element#html(String) method, the content is now
   treated as a DataNode, not a TextNode. This means that characters like '<' will no longer be incorrectly escaped.
   As a related ergonomic improvement, the same behavior applies for Element#text(String) (i.e. the content will be
   treated as a DataNode, despite calling the text() method).
   <https://github.com/jhy/jsoup/issues/1419>

 * Bugfix: when wrapping HTML around an existing element with Element#wrap(String), will now take the content as
   provided and ignore normal HTML tree-building rules. This allows for e.g. a div tag to be placed inside of p tags.

 * Bugfix: the Elements#forms() method should return the selected immediate elements that are Forms, not children.
   <https://github.com/jhy/jsoup/pull/1403>

 * Bugfix: when creating a selector for an element with Element#cssSelector, if the element used a non-unique ID
   attribute, the returned selector may not match the desired element.
    <https://github.com/jhy/jsoup/issues/1085>

  * Bugfix: corrected the toString() methods of the Evaluator classes.

  * Bugfix: when converting a jsoup document to a W3C document (in W3CDom#convert), if a tag had XML illegal characters,
    a DOMException would be thrown. Now instead, that tag is represented as a text node.
    <https://github.com/jhy/jsoup/issues/1093>

  * Bugfix: if an HTML file ended with an open noscript tag, an "EOF" string would appear in the HTML output.

  * Bugfix: when parsing a document as XML, automatically set the output syntax to XML, and ensure that "<" characters
    in attributes are escaped as "&lt;" (which is not required in HTML as the quoted attribute contents are safe, but is
    required in XML).
    <https://github.com/jhy/jsoup/issues/1420>

  * Bugfix: [Fuzz] when parsing an attribute key containing "abs:abs", a validation error would be incorrectly
    thrown.
    <https://github.com/jhy/jsoup/issues/1541>

  * Bugfix: [Fuzz] could NPE while parsing in resetInsertionMode().
    <https://github.com/jhy/jsoup/issues/1538>

  * Bugfix: [Fuzz] when parsing XML, could Stack Overflow when parsing XML declarations.
    <https://github.com/jhy/jsoup/issues/1539>

  * Bugfix: [Fuzz] fixed a potential Stack Overflow when parsing mis-nested tfoot tags, and updated the tree parser for
    this situation to match the updated HTML5 spec.
    <https://github.com/jhy/jsoup/issues/1543>

  * Bugfix: [Fuzz] fixed a potentially slow HTML parse when tags are nested extremely deep (e.g. 88K depth), by limiting
    the formatting tag search depth to 256. In practice, it's generally between 4 and 8.
    <https://github.com/jhy/jsoup/issues/1544>

  * Bugfix: [Fuzz] when parsing an unterminated RCDATA token (e.g. a <title> tag), could throw an IO Exception "No
    buffer left to unconsume" when trying to rewind the buffer.
    <https://github.com/jhy/jsoup/issues/1542>

*** Release 1.13.1 [2020-Feb-29]
  * Improvement: added Element#closest(selector), which walks up the tree to find the nearest element matching the
    selector.
    <https://github.com/jhy/jsoup/issues/1326>

  * Improvement: memory optimizations, reducing the retained size of a Document by ~39%, and allocations by ~9%:
    1. Attributes holder in Elements is only created if the element has attributes
    2. Only track the baseUri in an element when it is set via DOM to a new value for a given tree
    3. After parsing, do not retain the input character reader (and associated buffers) in the Document#parser

  * Improvement: substantial parse speed improvements vs 1.12.x (bringing back to par with previous releases).
    <https://github.com/jhy/jsoup/issues/1327>

  * Improvement: when pretty-printing, comments in inline tags are not pushed to a newline.

  * Improvement: added Attributes#hasDeclaredValueForKey(key) and Attributes#hasDeclaredValueForKeyIgnoreCase(key), to
    check if an attribute is set but has no value. Useful in place of the deprecated and removed BooleanAttribute class
    and instanceof test.

  * Improvement: removed old methods and classes that were marked deprecated in previous releases.
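
As an illustration of the first 1.13.1 memory optimization listed above (creating the attributes holder only when an
element has attributes), here is a minimal sketch of lazy allocation. LazyAttrNode and its methods are hypothetical
names chosen for this example. They are not jsoup classes, and this is not jsoup's actual implementation.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical node type (not jsoup's Element): allocates its attribute map lazily. */
final class LazyAttrNode {
    // Stays null until the first attribute is written, so attribute-free nodes pay nothing.
    private Map<String, String> attributes;

    /** Sets an attribute, creating the backing map on first use. */
    LazyAttrNode attr(String key, String value) {
        if (attributes == null) {
            attributes = new LinkedHashMap<>();
        }
        attributes.put(key, value);
        return this;
    }

    /** Reads an attribute without forcing allocation; absent keys yield an empty string. */
    String attr(String key) {
        if (attributes == null) {
            return "";
        }
        String value = attributes.get(key);
        return value == null ? "" : value;
    }

    /** True once any attribute has been set, i.e. once the holder exists. */
    boolean hasAttributes() {
        return attributes != null;
    }
}
```

Reads on a node that never had an attribute set leave the holder unallocated, which is where the saving would come
from in documents made up mostly of attribute-free elements.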

  * Improvement: added Element#select(Evaluator) and Element#selectFirst(Evaluator), to allow re-use of a parsed CSS
    selector if using the same evaluator many times.
    <https://github.com/jhy/jsoup/issues/1319>

  * Improvement: added Elements#forms(), Elements#textNodes(), Elements#dataNodes(), and Elements#comments(), as a
    convenient way to get access to these node types directly from an element selection.

  * Improvement: preserve whitespace before html and head tag, if pretty-printing is off.

  * Bugfix: in a <select> tag, a second <optgroup> would not automatically close an earlier open <optgroup>
    <https://github.com/jhy/jsoup/issues/1313>

  * Bugfix: in CharacterReader when parsing an input stream, could throw a Mark Invalid exception if the reader was
    marked, a bufferUp occurred, and then the reader was rewound.
    <https://github.com/jhy/jsoup/issues/1324>

  * Bugfix: empty tags and form tags did not have their attributes normalized (lower-cased by default)
    <https://github.com/jhy/jsoup/pull/1323>

  * Bugfix: when preserve case was set to on, the HTML pretty-print formatter didn't indent capitalized tags correctly.

  * Bugfix: ensure that script and style contents are parsed into DataNodes, not TextNodes, when in case-sensitive
    parse mode.

*** Release 1.12.2 [2020-Feb-08]
  * Improvement: the :has() selector now supports relative selectors. For example, the query
    "div:has(> a)" will select all "div" elements that have at least one direct child "a" element.
    <https://github.com/jhy/jsoup/pull/1214>

  * Improvement: added Element chaining methods for various overridden methods on Node.
    <https://github.com/jhy/jsoup/issues/1193>

  * Improvement: ensure HTTP keepalives work when fetching content via body() and bodyAsBytes().
    <https://github.com/jhy/jsoup/issues/1232>

  * Improvement: set the default max body size in Jsoup.Connection to 2MB (up from 1MB) so fewer people get trimmed
    content if they have not set it, but still in sensible bounds. Also updated the default user-agent to improve
    default compatibility.

  * Improvement: dramatic speed improvement when bulk inserting child nodes into an element (wrapping contents).
    <https://github.com/jhy/jsoup/issues/1281>

  * Improvement: added Element#childrenSize() as a convenience to get the number of an element's child elements.
    <https://github.com/jhy/jsoup/pull/1291>

  * Improvement: in W3CDom.asString, allow the output mode to be specified as HTML or as XML. It will default to
    checking the content, and automatically selecting.

  * Improvement: added a Document#documentType() method, to get a doc's doctype.

  * Improvement: To DocumentType, added #name(), #publicId(), and #systemId() methods to fetch those fields.

  * Improvement: in W3CDom conversions from jsoup documents, retain the DocumentType, and be able to serialize it.
    <https://github.com/jhy/jsoup/issues/1183>

  * Bugfix: on pages fetched by Jsoup.Connection, a "Mark Invalid" exception might be incorrectly thrown, or the page
    may miss some data. This occurred on larger pages when the file transfer was chunked, and an invalid HTML entity
    happened to cross a chunk boundary.
    <https://github.com/jhy/jsoup/issues/1218>

  * Bugfix: if duplicate attributes in an element exist, retain the first vs the last attribute with the same name. Case
    aware (HTML names are case-insensitive, XML names are case-sensitive).
    <https://github.com/jhy/jsoup/issues/1219>

  * Bugfix: don't submit input type=button form elements.
    <https://github.com/jhy/jsoup/issues/1231>

  * Bugfix: handle error position reporting correctly and don't blow up in some edge cases.
    <https://github.com/jhy/jsoup/issues/1251>
    <https://github.com/jhy/jsoup/pull/1253>

  * Bugfix: handle the ^= (starts with) selector correctly when the prefix starts with a space.
    <https://github.com/jhy/jsoup/pull/1280>

  * Bugfix: don't strip out zero-width-joiners (or zero-width-non-joiners) when normalizing text. That breaks combined
    emoji (and other text semantics). ♂️
    <https://github.com/jhy/jsoup/issues/1269>

  * Bugfix: Evaluator.TagEndsWith (namespaced elements) and Tag disagreed in case-sensitivity. Now correctly matches
    case-insensitively.
    <https://github.com/jhy/jsoup/issues/1257>

  * Bugfix: Don't throw an exception if a selector ends in a space, just trim it.
    <https://github.com/jhy/jsoup/issues/1274>

  * Bugfix: HTML parser adds redundant text when parsing self-closing textarea.
    <https://github.com/jhy/jsoup/issues/1220>

  * Bugfix: Don't add spurious whitespace or newlines to HTML or text for inline tags.
    <https://github.com/jhy/jsoup/issues/1305>
    <https://github.com/jhy/jsoup/issues/731>

  * Bugfix: TextNode.outerHtml() wouldn't normalize correctly without a parent.
    <https://github.com/jhy/jsoup/issues/1309>

  * Bugfix: Removed binary input detection as it was causing too many false positives.
    <https://github.com/jhy/jsoup/issues/1250>

  * Bugfix: when cloning a TextNode, if .attributes() was hit before the clone() method, the text value would only be a
    shallow clone.
    <https://github.com/jhy/jsoup/issues/1176>

  * Various code hygiene updates.

*** Release 1.12.1 [2019-May-12]
  * Change: removed the deprecated method to disable TLS certificate checking, Connection.validateTLSCertificates().

  * Change: some internal methods have been rearranged; if you extended any of the Jsoup internals you may need to make
    updates.

  * Improvement: documents now remember their parser, so when later manipulating them, the correct HTML or XML tree
    builder is reused, as are the parser settings like case preservation.
    <https://github.com/jhy/jsoup/issues/769>

  * Improvement: Jsoup now detects the character set of the input if specified in an XML Declaration, when using the
    HTML parser. Previously that only happened when the XML parser was specified.
    <https://github.com/jhy/jsoup/issues/1009>

  * Improvement: if the document's input character set does not support encoding, flip it to one that does.
    <https://github.com/jhy/jsoup/issues/1007>

  * Improvement: if a start tag is missing a > and a new tag is seen with a <, treat that as a new tag. (This differs
    from the HTML5 spec, which would make an attribute with a name beginning with <, but in practice this impacts too
    many pages.)
    <https://github.com/jhy/jsoup/issues/797>

  * Improvement: performance tweaks when parsing start tags, data, tables.

  * Improvement: added Element.nextElementSiblings() and Element.previousElementSiblings()
    <https://github.com/jhy/jsoup/pull/1054>

  * Improvement: treat center tags as block tags.
    <https://github.com/jhy/jsoup/pull/1113>

  * Improvement: allow forms to be submitted with Content-Type=multipart/form-data without requiring a file upload;
    automatically set the mime boundary.
    <https://github.com/jhy/jsoup/pull/1058>

  * Improvement: Jsoup will now detect if an input file or URL is binary, and will refuse to attempt to parse it, with
    an IO exception. This prevents runaway processing time and wasted effort creating meaningless parsed DOM trees.
    <https://github.com/jhy/jsoup/issues/1192>

  * Bugfix: when using the tag case preserving parsing settings, certain HTML tree building rules were not followed
    for upper case tags.
    <https://github.com/jhy/jsoup/issues/1149>

  * Bugfix: when converting a Jsoup document to a W3C DOM, if an element is namespaced but not in a defined namespace,
    set it to the global namespace.
    <https://github.com/jhy/jsoup/issues/848>

  * Bugfix: attributes created with the Attribute constructor with just spaces for names would incorrectly pass
    validation.
    <https://github.com/jhy/jsoup/issues/1159>

  * Bugfix: some pseudo XML Declarations were incorrectly handled when using the XML Parser, leading to an
    index-out-of-bounds (IOOB) exception when parsing.
    <https://github.com/jhy/jsoup/issues/1139>

  * Bugfix: when parsing URL parameter names in an attribute that is not correctly HTML encoded, and near the end of the
    current buffer, those parameters may be incorrectly dropped. (Improved CharacterReader mark/reset support.)
    <https://github.com/jhy/jsoup/pull/1154>

  * Bugfix: boolean attribute values would be returned as null, vs an empty string, when accessed via the
    Attribute#getValue() method.
    <https://github.com/jhy/jsoup/issues/1065>

  * Bugfix: orphan Attribute objects (i.e. created outside of a parse or an Element) would throw an NPE on
    Attribute#setValue(val)
    <https://github.com/jhy/jsoup/issues/1107>

  * Bugfix: Element.shallowClone() was not making a clone of its attributes.
    <https://github.com/jhy/jsoup/issues/1201>

  * Bugfix: fixed an ArrayIndexOutOfBoundsException in HttpConnection.looksLikeUtf8 when testing small strings in
    specific ranges.
    <https://github.com/jhy/jsoup/issues/1172>

  * Updated jetty-server (which is used for integration tests) to latest 9.2 series (9.2.28).

*** Release 1.11.3 [2018-Apr-15]
  * Improvement: CDATA sections are now treated as whitespace preserving (regardless of the containing element), and are
    round-tripped into output HTML.
    <https://github.com/jhy/jsoup/issues/406>
    <https://github.com/jhy/jsoup/issues/965>

  * Improvement: added support for Deflate encoding.
    <https://github.com/jhy/jsoup/pull/982>

  * Improvement: when parsing <pre> tags, skip the first newline if present.
    <https://github.com/jhy/jsoup/issues/825>

  * Improvement: support nested quotes for attribute selection queries.
    <https://github.com/jhy/jsoup/pull/988>

  * Improvement: character references from Windows-1252 that are not valid Unicode are mapped to the appropriate
    Unicode replacement.
    <https://github.com/jhy/jsoup/pull/1046>

  * Improvement: accept a custom SSL socket factory in Jsoup.Connection.
    <https://github.com/jhy/jsoup/pull/1038>

  * Bugfix: "Mark has been invalidated" exception was thrown when parsing some URLs on Android <= 6.
    <https://github.com/jhy/jsoup/issues/990>

  * Bugfix: The Element.text() for <div>One</div>Two was "OneTwo", not "One Two".
    <https://github.com/jhy/jsoup/issues/812>

  * Bugfix: boolean attributes with empty string values were not collapsing in HTML output.
    <https://github.com/jhy/jsoup/issues/985>

  * Bugfix: when using the XML Parser set to lowercase normalize tags, uppercase closing tags were not correctly
    handled.
    <https://github.com/jhy/jsoup/issues/998>

  * Bugfix: when parsing from a URL, an end tag could be read incorrectly if it started on a buffer boundary.
    <https://github.com/jhy/jsoup/issues/995>

  * Bugfix: when parsing from a URL, if the remote server failed to complete its write (i.e. it wrote less than the
    Content Length header promised on a gzipped stream), the parse method would incorrectly throw an unchecked
    exception. It now throws the declared IOException.
    <https://github.com/jhy/jsoup/issues/980>

  * Bugfix: leaf nodes (such as text nodes) were throwing an unsupported operation exception on childNodes(), instead
    of just returning an empty list.
    <https://github.com/jhy/jsoup/issues/1032>

  * Bugfix: documents with a leading UTF-8 BOM did not have that BOM consumed, so it acted as a zero width no-break
    space, which could impact the parse tree.
    <https://github.com/jhy/jsoup/issues/1003>

  * Bugfix: when parsing an invalid XML declaration, the parse would fail.
    <https://github.com/jhy/jsoup/issues/1015>

*** Release 1.11.2 [2017-Nov-19]
  * Improvement: added a new pseudo selector :matchText, which allows text nodes to match as if they were elements.
    This enables finding text that is only marked by a "br" tag, for example.
    <https://github.com/jhy/jsoup/issues/550>

  * Change: marked Connection.validateTLSCertificates() as deprecated.

  * Improvement: normalize invisible characters (like soft-hyphens) in Element.text().
    <https://github.com/jhy/jsoup/issues/978>

  * Improvement: added Element.wholeText(), to easily get the un-normalized text value of an element and its children.
    <https://github.com/jhy/jsoup/pull/564>

  * Bugfix: in a deep DOM stack, a StackOverflow exception could occur when generating implied end tags.
    <https://github.com/jhy/jsoup/issues/966>

  * Bugfix: when parsing attribute values that happened to cross a buffer boundary, a character was dropped.
    <https://github.com/jhy/jsoup/issues/967>

  * Bugfix: fixed an issue that prevented using infinite timeouts in Jsoup.Connection.
    <https://github.com/jhy/jsoup/issues/968>

  * Bugfix: whitespace preserving tags were not honoured when nested deeper than two levels.
    <https://github.com/jhy/jsoup/issues/722>

  * Bugfix: an unterminated comment token at the end of the HTML input would cause an out of bounds exception.
    <https://github.com/jhy/jsoup/issues/972>

  * Bugfix: an NPE in the Cleaner which would occur if an <a href> attribute value was missing.
    <https://github.com/jhy/jsoup/issues/973>

  * Bugfix: when serializing the same document in multiple threads, on Android, with a character set that is not ASCII
    or UTF-8, an encoding exception could occur.
    <https://github.com/jhy/jsoup/issues/970>

  * Bugfix: removing a form value from the DOM would not remove it from FormData.
    <https://github.com/jhy/jsoup/pull/969>

  * Bugfix: in the W3CDom transformer, siblings were incorrectly inheriting namespaces defined on previous siblings.
    <https://github.com/jhy/jsoup/issues/977>

*** Release 1.11.1 [2017-Nov-06]
  * Updated language level to Java 7 from Java 5. To maintain Android support (of minversion 8), try-with-resources is
    not used.
    <https://github.com/jhy/jsoup/issues/899>

  * When loading content from a URL or a file, the content is now parsed as it streams in from the network or disk,
    rather than being fully buffered before parsing. This substantially reduces memory consumption & large garbage
    objects when loading large files. Note that this change means that a response, once parsed, may not be parsed
    again from the same response object unless you call response.bufferUp() first, which will buffer the full response
    into memory.
    <https://github.com/jhy/jsoup/issues/904>

  * Added Connection.Response.bodyStream(), a method to get the response body as an input stream. This is useful for
    saving a large response straight to a file, without buffering fully into memory first.
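
The save-to-file use case for bodyStream() can be sketched with the JDK alone. BodyStreamSketch and saveTo are
hypothetical names for this example and are not part of jsoup. The sketch assumes the caller obtains the stream
elsewhere, for instance from Connection.Response.bodyStream().

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Hypothetical helper (not part of jsoup): streams a response body to disk. */
final class BodyStreamSketch {
    private BodyStreamSketch() {}

    /**
     * Copies the stream to the target file in chunks, so the full body is never held in memory,
     * and returns the number of bytes written. The stream is closed when the copy finishes.
     */
    static long saveTo(InputStream body, Path target) throws IOException {
        try (InputStream in = body) {
            return Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }
}
```

With jsoup on the classpath, a call might look like
BodyStreamSketch.saveTo(Jsoup.connect(url).execute().bodyStream(), Paths.get("page.html")), where url and the output
path are placeholders.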

 * Performance improvements in text and HTML generation (through less GC).

 * Reduced memory consumption of text, scripts, and comments in the DOM by 40%, by refactoring the node
   hierarchy to not track childnodes or attributes by default for leaf nodes. For the average document, that's about a
   30% memory reduction.
   <https://github.com/jhy/jsoup/issues/911>

 * Reduced memory consumption of Elements by refactoring their Attributes to be a simple pair of arrays, vs a
   LinkedHashSet.
   <https://github.com/jhy/jsoup/issues/911>

 * Added support for Element.selectFirst(query), to efficiently find the first matching element.

 * Added Element.appendTo(parent) to simplify slinging elements about.
   <https://github.com/jhy/jsoup/pull/662>

 * Added support for multiple headers with the same name in Jsoup.Connect

 * Added Element.shallowClone() and Node.shallowClone(), to allow cloning nodes without getting all their children.
   <https://github.com/jhy/jsoup/issues/900>

 * Updated Element.text() and the :contains(text) selector to consider &nbsp; characters as spaces.

 * Updated Jsoup.connect().timeout() to implement a total connect + combined read timeout. Previously it specified
   connect and buffer read times only, so to implement a combined total timeout, you had to have another thread send
   an interrupt.

 * Improved performance of Node.addChildren (was quadratic)
   <https://github.com/jhy/jsoup/pull/930>

 * Added missing support for template tags in tables
   <https://github.com/jhy/jsoup/pull/901>

 * In Jsoup.connect file uploads, added the ability to set the uploaded files' mimetype.
   <https://github.com/jhy/jsoup/issues/936>

 * Improved Node traversal, including less object creation, and partial and filtering traversor support.
   <https://github.com/jhy/jsoup/pull/849>

 * Bugfix: if a document was re-decoded after character set detection, the HTML parser was not reset correctly,
   which could lead to an incorrect DOM.
   <https://github.com/jhy/jsoup/issues/877>

 * Bugfix: attributes with the same name but different case would be incorrectly treated as different attributes.
   <https://github.com/jhy/jsoup/pull/903>

 * Bugfix: self-closing tags for known empty elements were incorrectly treated as errors.
   <https://github.com/jhy/jsoup/issues/868>

 * Bugfix: fixed an issue where a self-closing title, noframes, or style tag would cause the rest of the page to be
   incorrectly parsed as data or text.
   <https://github.com/jhy/jsoup/issues/906>

 * Bugfix: fixed an issue with unknown mixed-case tags
   <https://github.com/jhy/jsoup/pull/942>

 * Bugfix: fixed an issue where the entity resources were left open after startup, causing a warning.
   <https://github.com/jhy/jsoup/pull/928>

 * Bugfix: fixed an issue where Element.getElementsByIndexLessThan(index) would incorrectly provide the root element
   <https://github.com/jhy/jsoup/pull/918>

 * Improved parse time for pages with exceptionally deeply nested tags.
   <https://github.com/jhy/jsoup/issues/955>

 * Improvement / workaround: modified the Entities implementation to load its data from a .class vs from a jar resource.
   Faster, and safer on Android.
   <https://github.com/jhy/jsoup/issues/959>

*** Release 1.10.3 [2017-Jun-11]
 * Added Elements.eachText() and Elements.eachAttr(name), which return a list of each Element's text or attribute
   values, respectively. This makes it simpler to, for example, get a list of each URL on a page:
     List<String> urls = doc.select("a").eachAttr("abs:href");

 * Improved selector validation for :contains(...) with unbalanced quotes.
   <https://github.com/jhy/jsoup/issues/803>

 * Improved the speed of index based CSS selectors and other methods that use elementSiblingIndex, by a factor of 34x.
   <https://github.com/jhy/jsoup/pull/862>

 * Added Node.clearAttributes(), to simplify removing all attributes of a Node / Element.
   <https://github.com/jhy/jsoup/issues/829>

 * Bugfix: if an attribute name started or ended with a control character, the parse would fail with a validation
   exception.
   <https://github.com/jhy/jsoup/issues/793>

 * Bugfix: Element.hasClass() and the ".classname" selector would not find the class attribute case-insensitively.
   <https://github.com/jhy/jsoup/issues/814>

 * Bugfix: In Jsoup.Connection, if a redirect contained a query string with %xx escapes, they would be double escaped
   before the redirect was followed, leading to fetching an incorrect location.

 * Bugfix: In Jsoup.Connection, if a request body was set and the connection was redirected, the body would incorrectly
   still be sent.
   <https://github.com/jhy/jsoup/pull/881>

 * Bugfix: In DataUtil when detecting the character set from meta data, and there are two Content-Types defined, use
   the one that defines a character set.
   <https://github.com/jhy/jsoup/pull/835>

 * Bugfix: when parsing unknown tags in case-sensitive HTML mode, end tags would not close scope correctly.
   <https://github.com/jhy/jsoup/issues/819>

 * In Jsoup.Connection, ensure there is no Content-Type set when being redirected to a GET.
   <https://github.com/jhy/jsoup/pull/895>

 * Bugfix: in certain locales (Turkey specifically), lowercasing and case insensitivity could fail for specific items.
   <https://github.com/jhy/jsoup/pull/820>

 * Bugfix: after an element was cloned, changes to its child list were not notifying the element correctly.
   <https://github.com/jhy/jsoup/issues/951>

*** Release 1.10.2 [2017-Jan-02]
 * Improved startup time, particularly on Android, by reducing garbage generation and CPU execution time when loading
   the HTML entity files. About 1.72x faster in this area.

 * Added Element.is(query) to check if an element matches this CSS query.

 * Added new methods to Elements: next(query), nextAll(query), prev(query), prevAll(query) to select next and previous
   element siblings from a current selection, with optional selectors.

 * Added Node.root() to get the topmost ancestor of a Node.

 * Added the new selector :containsData(), to find elements that hold data, like script and style tags.

 * Changed Jsoup.isValid(bodyHtml) to validate that the input contains only body HTML that is safe according to the
   safelist, and does not include HTML errors. And in the Jsoup.Cleaner.isValid(Document) method, make sure the doc
   only includes body HTML.
   <https://github.com/jhy/jsoup/issues/245>
   <https://github.com/jhy/jsoup/issues/632>

 * In Safelists, validate that a removed protocol exists before removing said protocol.

 * Allow the Jsoup.Connect thread to be interrupted when reading the input stream; helps when reading from a long stream
   of data that doesn't hit the read timeout.
   <https://github.com/jhy/jsoup/pull/712>

 * Jsoup.Connect now uses a desktop user agent by default.
   Many developers were getting caught by not specifying the user agent, and sending the default 'Java'. That causes
   many servers to return different content than what they would to a desktop browser, and what the developer was
   expecting.

 * Increased the default connect/read timeout in Jsoup.Connect to 30 seconds.

 * Jsoup.Connect now detects if a header value is actually in UTF-8 vs the HTTP spec of ISO-8859-1, and converts
   the header value appropriately. This improves compatibility with servers that are configured incorrectly.

 * Bugfix: in Jsoup.Connect, URLs containing non-URL-safe characters were not encoded to URL safe correctly.
   <https://github.com/jhy/jsoup/issues/706>

 * Bugfix: a "SYSTEM" flag in doctype tags would be incorrectly removed.
   <https://github.com/jhy/jsoup/issues/408>

 * Bugfix: removing attributes from an Element with removeAttr() would cause a ConcurrentModificationException.

 * Bugfix: the contents of Comment nodes were not returned by Element.data()

 * Bugfix: if the source was checked out on Windows with git autocrlf=true, Entities.load would fail because of the
   \r char.
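
The header charset handling listed for Jsoup.Connect in release 1.10.2 can be pictured with the sketch below. It is
only an illustration of the general technique, not jsoup's actual implementation: it assumes the header value was
decoded as ISO-8859-1 (one char per byte), recovers the raw bytes, and re-decodes them as UTF-8 only if they form a
strictly valid UTF-8 sequence. The class and method names (HeaderCharsetSketch, fixHeaderEncoding) are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not jsoup's implementation.
final class HeaderCharsetSketch {

    // Returns the value re-decoded as UTF-8 when its bytes are valid
    // multi-byte UTF-8; otherwise returns the value unchanged.
    static String fixHeaderEncoding(String value) {
        if (value == null) return null;

        boolean hasHighByte = false;
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            if (c > 0xFF) return value;        // not an ISO-8859-1 decode; leave alone
            if (c >= 0x80) hasHighByte = true; // candidate for a UTF-8 sequence
        }
        if (!hasHighByte) return value;        // pure ASCII is identical in both charsets

        // Recover the original bytes (ISO-8859-1 maps each char to one byte).
        byte[] bytes = value.getBytes(StandardCharsets.ISO_8859_1);

        // A strict decoder reports malformed input instead of substituting.
        CharsetDecoder strictUtf8 = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            return strictUtf8.decode(ByteBuffer.wrap(bytes)).toString();
        } catch (CharacterCodingException e) {
            return value; // not valid UTF-8, so keep the ISO-8859-1 reading
        }
    }
}
```

A new decoder is created per call because CharsetDecoder instances are not safe to share between threads.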

*** Release 1.10.1 [2016-Oct-23]
 * New feature: added the option to preserve case for tags and/or attributes, with ParseSettings. By default, the HTML
   parser will continue to normalize tag names and attribute names to lower case, and the XML parser will now preserve
   case, according to the relevant spec. The CSS selectors for tags and attributes remain case insensitive, per the CSS
   spec.

 * Improved support for extended HTML entities, including supplemental characters and multiple character references.
   Also reduced memory consumption of the entity tables.
   <https://github.com/jhy/jsoup/issues/602>
   <https://github.com/jhy/jsoup/issues/603>

 * Added support for *|E wildcard namespace selectors.
   <https://github.com/jhy/jsoup/pull/724>

 * Added support for setting multiple connection headers at once with Connection.headers(Map)
   <https://github.com/jhy/jsoup/pull/725>

 * Added support for setting/overriding the response character set in Connection.Response, for cases where the charset
   is not defined by the server, or is defined incorrectly.
   <https://github.com/jhy/jsoup/issues/743>

 * Improved performance of class selectors by reducing memory allocation and garbage collection.
   <https://github.com/jhy/jsoup/pull/753>

 * Improved performance of HTML output by reducing the creation of temporary attribute list iterators.
   <https://github.com/jhy/jsoup/pull/755>

 * Fixed an issue when converting to the W3CDom XML, where valid (but ugly) HTML attribute names containing characters
   like '"' could not be converted into valid XML attribute names. These attribute names are now normalized if possible,
   or not added to the XML DOM.
   <https://github.com/jhy/jsoup/issues/721>

 * Fixed an OOB exception when loading an empty-body URL and parsing with the XML parser.
   <https://github.com/jhy/jsoup/issues/727>

 * Fixed an issue where attribute names starting with a slash would be parsed incorrectly.
   <https://github.com/jhy/jsoup/pull/748>

 * Don't reuse charset encoders from OutputSettings, to make it threadsafe.
   <https://github.com/jhy/jsoup/issues/740>

 * Fixed an issue in connections with a requestBody where a custom content-type header could be ignored.
   <https://github.com/jhy/jsoup/issues/756>

*** Release 1.9.2 [2016-May-17]
 * Fixed an issue where tag names that contained non-ascii characters but started with an ascii character
   would cause the parser to get stuck in an infinite loop.
   <https://github.com/jhy/jsoup/issues/704>

 * In XML documents, detect the charset from the XML prolog - <?xml encoding="UTF-8"?>
   <https://github.com/jhy/jsoup/issues/701>

 * Fixed an issue where created XML documents would have an incorrect prolog.
   <https://github.com/jhy/jsoup/issues/652>

 * Fixed an issue where you could not use an attribute selector to find values containing unbalanced braces or
   parentheses.
   <https://github.com/jhy/jsoup/issues/611>

 * Fixed an issue where namespaced tags (like <fb:comment>) would cause Element.cssSelector() to fail.
   <https://github.com/jhy/jsoup/pull/677>

*** Release 1.9.1 [2016-Apr-16]
 * Added support for HTTP and SOCKS request proxies, specifiable per connection.
   <https://github.com/jhy/jsoup/pull/570>

 * Added support for sending plain HTTP request bodies in POST and PUT requests, with Connection.requestBody(String).

 * Added support in Jsoup.Connect for HEAD, OPTIONS, TRACE.
   <https://github.com/jhy/jsoup/issues/613>

 * Added support for HTTP 307 Temporary Redirect (replays posts, if applicable).
   <https://github.com/jhy/jsoup/pull/666>

 * Performance improvements when parsing HTML, particularly for Android Dalvik.

 * Added support for writing HTML into Appendable objects (like OutputStreamWriter), to enable stream serialization.
   <https://github.com/jhy/jsoup/pull/470/>

 * Added support for XML namespaces when converting jsoup documents to W3C documents.
   <https://github.com/jhy/jsoup/pull/672>

 * Added support for UTF-16 and UTF-32 character set detection from byte-order-marks (BOM).
   <https://github.com/jhy/jsoup/issues/695>

 * Added support for tags with non-ascii (unicode) letters.
   <https://github.com/jhy/jsoup/issues/667>

 * Added Connection.data(key) to retrieve a data KeyVal by its key. Useful to update form data before submission.

 * Fixed an issue in the Parent selector where it would not match against the root element it was applied to.
   <https://github.com/jhy/jsoup/pull/619>

 * Fixed an issue where elements.select(query) would not return every matching element if they had the same content.
   <https://github.com/jhy/jsoup/issues/614>

 * Added not-null validators to Element.appendText() and Element.prependText()
   <https://github.com/jhy/jsoup/issues/690>

 * Fixed an issue when moving nodes using Element.insert(index, children) where the sibling index would be set
   incorrectly, leading to the original nodes being lost.
   <https://github.com/jhy/jsoup/issues/689>

 * Reverted Node.equals() and Node.hashCode() back to identity (object) comparisons, as deep content inspection
   had negative performance impacts and hashkey stability problems. Functionality replaced with Node.hasSameContent().
   <https://github.com/jhy/jsoup/issues/688>

 * In Jsoup.Connect, if the same header key is seen multiple times, combine their values with a comma per the HTTP RFC,
   instead of keeping just one value. Also fixes an issue where header values could be out of order.
   <https://github.com/jhy/jsoup/issues/618>

*** Release 1.8.3 [2015-Aug-02]
 * Added support for custom boolean attributes.
   <https://github.com/jhy/jsoup/pull/555>

 * When fetching XML URLs, automatically switch to the XML parser instead of the HTML parser.
   <https://github.com/jhy/jsoup/pull/574>

 * Performance improvement on parsing larger HTML pages. On Android KitKat, around 1.7x faster. On Android
   Lollipop, ~ 1.3x faster.
   Improvements largely from re-ordering the HtmlTreeBuilder methods based on analysis of
   various websites; also from further memory reduction for nodes with no children, and other tweaks.

 * Fixed an issue in Element.getElementSiblingIndex (and related methods) where sibling elements with the same content
   would incorrectly have the same sibling index.
   <https://github.com/jhy/jsoup/issues/554>

 * Fixed an issue where unexpected elements in a badly nested table could be moved to the wrong location in the
   document.
   <https://github.com/jhy/jsoup/issues/552>

 * Fixed an issue where a table nested within a TH cell would parse to an incorrect tree.
   <https://github.com/jhy/jsoup/issues/575>

 * When serializing a document using the XHTML encoding entities, if the character set did not support &nbsp; chars
   (such as Shift_JIS), the character would be skipped. For visibility, will now always output &#xa0; when using XHTML
   encoding entities (as &nbsp; is not defined), regardless of the output character set.
   <https://github.com/jhy/jsoup/issues/523>

 * Fixed an issue when resolving URLs, if the absolute URL had no path, the relative URL was not normalized correctly.
   Also fixed an issue where connections that were redirected to a relative URL did not have the same normalization
   rules as a URL read from Node.absUrl(String).
   <https://github.com/jhy/jsoup/issues/585>

 * When serialising XML, ensure that '<' characters in attributes are escaped, per spec. Not required in HTML.
   <https://github.com/jhy/jsoup/issues/528>

*** Release 1.8.2 [2015-Apr-13]
 * Performance improvements for parsing HTML on Android, of 1.5x to 1.9x, with larger parses getting a bigger
   speed increase. For non-Android JREs, around 1.1x to 1.2x.

 * Dramatic performance improvement in HTML serialization on Android (KitKat and later), of 115x. Improvement by working
   around a character set encoding speed regression in Android.
   <https://github.com/jhy/jsoup/issues/383>

 * Performance improvement for the class name selector on Android (.class) of 2.5x to 14x. Around 1.2x
   on non-Android JREs.

 * File upload support. Added the ability to specify input streams for POST data, which will upload content in
   MIME multipart/form-data encoding.

 * Add a meta-charset element to documents when setting the character set, so that the document's charset is
   unambiguous.
   <https://github.com/jhy/jsoup/pull/486>

 * Added ability to disable TLS (SSL) certificate validation. Helpful if you're hitting a host with a bad cert,
   or your JDK doesn't support SNI.
   <https://github.com/jhy/jsoup/pull/343>

 * Added ability to further tweak the canned Cleaner Safelists by removing existing settings.
   <https://github.com/jhy/jsoup/pull/449>

 * Added option in Cleaner Safelist to allow linking to in-page anchors (#)
   <https://github.com/jhy/jsoup/pull/441>

 * Use a lowercase doctype tag for HTML5 documents.

 * Add support for 201 Created with redirect, and other status codes. Treats any HTTP status code 2xx or 3xx as an OK
   response, and follows redirects whenever there is a Location header.
   <https://github.com/jhy/jsoup/issues/312>

 * Added support for HTTP method verbs PUT, DELETE, and PATCH.

 * Added support for overriding the default POST character set of UTF-8
   <https://github.com/jhy/jsoup/pull/491>

 * W3C DOM support: added ability to convert from a jsoup document to a W3C document, with the W3CDom helper class.

 * In the HtmlToPlainText example program, added the ability to filter using a CSS selector. Also clarified
   the usage documentation.

 * Fixed validation of cookie names in HttpConnection cookie methods.
   <https://github.com/jhy/jsoup/pull/377>

 * Fixed an issue where <option> tags would be missed when preparing a form for submission if missing a selected
   attribute.

 * Fixed an issue where submitting a form would incorrectly include radio and checkbox values without the checked
   attribute.

 * Fixed an issue where Element.classNames() could return a set containing an empty class name, or names with
   extraneous whitespace.
   <https://github.com/jhy/jsoup/pull/469>

 * Fixed an issue where attributes selected by value were not correctly space normalized.
   <https://github.com/jhy/jsoup/pull/526>

 * In head+noscript elements, treat content as character data, instead of jumping out of head parsing.
   <https://github.com/jhy/jsoup/pull/540>

 * Fixed performance issue when parsing HTML with elements with many children that need re-parenting.
   <https://github.com/jhy/jsoup/pull/506>

 * Fixed an issue where a server returning an unsupported character set response would cause a runtime
   UnsupportedCharsetException, instead of falling back to the default UTF-8 charset.
   <https://github.com/jhy/jsoup/pull/509>

 * Fixed an issue where Jsoup.Connection would throw an IOException when reading a page with zero content-length.
   <https://github.com/jhy/jsoup/issues/538>

 * Improved the equals() and hashCode() methods in Node, to consider all their child content, for DOM tree comparisons.
   <https://github.com/jhy/jsoup/issues/537>

 * Improved performance in Selector when searching multiple roots.
   <https://github.com/jhy/jsoup/issues/518>

*** Release 1.8.1 [2014-Sep-27]
 * Introduced the ability to choose between HTML and XML output, and made HTML the default. This means img tags are
   output as <img>, not <img />. XML is the default when using the XmlTreeBuilder. Control this with the
   Document.OutputSettings.syntax() method.

 * Improved the performance of Element.text() by 3.2x

 * Improved the performance of Element.html() by 1.7x

 * Improved file read time by 2x, giving around a 10% speed improvement to file parses.
   <https://github.com/jhy/jsoup/issues/248>

 * Tightened the scope of what characters are escaped in attributes and textnodes, to align with the spec. Also, when
   using the extended escape entities map, only escape a character if the current output charset does not support it.
   This produces smaller, more legible HTML, with greater control over the output (by setting charset and escape mode).
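
The charset-dependent escaping rule in the note above can be illustrated with the JDK alone. The following is a minimal sketch, not jsoup's implementation: the class name CharsetAwareEscaper, its escape() method, and the choice of hexadecimal numeric references are assumptions made for illustration. It always escapes markup-significant characters, and escapes any other character only when the output charset cannot encode it.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;

/** Hypothetical sketch: escape a character only if the output charset cannot encode it. */
final class CharsetAwareEscaper {

    static String escape(String text, Charset charset) {
        CharsetEncoder encoder = charset.newEncoder();
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); ) {
            int codePoint = text.codePointAt(i);
            int width = Character.charCount(codePoint); // 2 for supplementary characters
            String ch = text.substring(i, i + width);
            switch (codePoint) {
                // Markup-significant characters are escaped regardless of charset.
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                default:
                    if (encoder.canEncode(ch)) {
                        out.append(ch); // representable in the charset: emit as-is
                    } else {
                        // Not representable: fall back to a numeric character reference.
                        out.append("&#x").append(Integer.toHexString(codePoint)).append(';');
                    }
            }
            i += width;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String text = "caf\u00e9 < \u2603";
        System.out.println(escape(text, StandardCharsets.UTF_8));
        System.out.println(escape(text, StandardCharsets.US_ASCII));
    }
}
```

With a UTF-8 output charset the accented letter and the snowman pass through untouched; with US-ASCII both become numeric references, which is the trade-off between output size and portability that the note describes.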

 * If pretty-print is disabled, don't trim outer whitespace in Element.html()
   <https://github.com/jhy/jsoup/issues/368>

 * In the HTML Cleaner, allow span tags in the basic safelist, and span and div tags in the relaxed safelist.

 * Added Element.cssSelector(), which returns a unique CSS selector/path for an element.
   <https://github.com/jhy/jsoup/pull/459>

 * Fixed an issue where <svg><img/></svg> was parsed as <svg><image/></svg>
   <https://github.com/jhy/jsoup/issues/364>

 * Fixed an issue where a UTF-8 BOM character was not detected if the HTTP response did not specify a charset, and
   the HTML body did, leading to the head contents incorrectly being parsed into the body. Changed the behavior so that
   when the UTF-8 BOM is detected, it will take precedence for determining the charset to decode with.
   <https://github.com/jhy/jsoup/issues/348>

 * Relaxed doctype validation, allowing doctypes to not specify a name.
   <https://github.com/jhy/jsoup/issues/460>

 * Fixed an issue in parsing a base URI when loading a URL containing a http-equiv element.
   <https://github.com/jhy/jsoup/issues/440>

 * Fixed an issue for Java 1.5 / Android 2.2 compatibility, and verify it doesn't regress.
   <https://github.com/jhy/jsoup/issues/375>
   <https://github.com/jhy/jsoup/pull/403>

 * Fixed an issue that would throw an NPE when trying to set invalid HTML into a title element.
   <https://github.com/jhy/jsoup/pull/410>

 * Added support for quoted attribute values in CSS Selectors
   <https://github.com/jhy/jsoup/pull/400>

 * Fixed support for nth-of-type selectors with unknown tags.
   <https://github.com/jhy/jsoup/pull/402>

 * Added support for 'application/*+xml' mimetypes.
   <https://github.com/jhy/jsoup/pull/444>

 * Fixed support for allowing script tags in cleaner Safelists.
   <https://github.com/jhy/jsoup/issues/299>
   <https://github.com/jhy/jsoup/issues/388>

 * In FormElements, don't submit disabled inputs, and use 'on' as checkbox value default.
   <https://github.com/jhy/jsoup/issues/489>

*** Release 1.7.3 [2013-Nov-10]
 * Introduced FormElement, providing easy access to form controls and their data, and the ability to submit forms
   with Jsoup.Connect.

 * Reduced GC impact during HTML parsing, with 17% fewer objects created, and 3% faster parses.

 * Reduced CSS selection time by 26% for common queries.

 * Improved HTTP character set detection.
   <https://github.com/jhy/jsoup/pull/325> <https://github.com/jhy/jsoup/issues/321>

 * Added Document.location, to get the URL the document was retrieved from. Helpful if connection was redirected.
   <https://github.com/jhy/jsoup/pull/306>

 * Fixed support for self-closing script tags.
   <https://github.com/jhy/jsoup/issues/305>

 * Fixed a crash when reading an unterminated CDATA section.
   <https://github.com/jhy/jsoup/issues/349>

 * Fixed an issue where elements added via the adoption agency algorithm did not preserve their attributes.
   <https://github.com/jhy/jsoup/issues/313>

 * Fixed an issue when cloning a document with extremely nested elements that could cause a stack-overflow.
   <https://github.com/jhy/jsoup/issues/290>

 * Fixed an issue when connecting or redirecting to a URL that contains a space.
   <https://github.com/jhy/jsoup/pull/354> <https://github.com/jhy/jsoup/issues/114>

 * Added support for the HTTP/1.1 Temporary Redirect (307) status code.
   <https://github.com/jhy/jsoup/issues/452>

*** Release 1.7.2 [2013-Jan-27]
 * Added support for supplementary characters outside of the Basic Multilingual Plane.
   <https://github.com/jhy/jsoup/issues/288> <https://github.com/jhy/jsoup/pull/289>

 * Added support for structural pseudo CSS selectors, including :first-child, :last-child, :nth-child, :nth-last-child,
   :first-of-type, :last-of-type, :nth-of-type, :nth-last-of-type, :only-child, :only-of-type, :empty, and :root
   <https://github.com/jhy/jsoup/pull/208>

 * Added a maximum body response size to Jsoup.Connection, to prevent running out of memory when trying to read
   extremely large documents. The default is 1MB.

 * Refactored the Cleaner to traverse rather than recurse child nodes, to avoid the risk of overflowing the stack.
   <https://github.com/jhy/jsoup/issues/246>

 * Added Element.insertChildren(), to easily insert a list of child nodes at a specific index.
   <https://github.com/jhy/jsoup/issues/239>

 * Added Node.childNodesCopy(), to create an independent copy of a Node's children.

 * When parsing in XML mode, preserve XML declarations (<?xml ... ?>).
   <https://github.com/jhy/jsoup/issues/242>

 * Introduced Parser.parseXmlFragment(), to allow easy parsing of XML fragments.
   <https://github.com/jhy/jsoup/issues/279>

 * Allow Safelist test methods to be extended
   <https://github.com/jhy/jsoup/issues/85>

 * Added Document.OutputSettings.outline mode, to aid HTML debugging by printing out in outline mode, similar to
   browser HTML inspectors.
   <https://github.com/jhy/jsoup/issues/273>

 * When parsing, allow all tags to self-close. Tags that aren't expected to self-close will get an end tag.
   <https://github.com/jhy/jsoup/issues/258>

 * Fixed an issue when parsing <textarea>/RCData tags containing unescaped closing tags that would drop the trailing >.

 * Corrected the javadoc for Element#child() to note that it throws IndexOutOfBoundsException.
   <https://github.com/jhy/jsoup/issues/277>

 * When cloning an Element, reset the classnames set so as not to hold a pointer to the source's.
   <https://github.com/jhy/jsoup/issues/278>

 * Limit how far up the stack the formatting adoption agency algorithm will travel, to prevent the chance of a run-away
   parse when the HTML stack is hopelessly deep.
   <https://github.com/jhy/jsoup/issues/234>

 * Modified Element.text() to build text by traversing child nodes rather than recursing. This avoids stack-overflow
   errors when the DOM is very deep and the VM stack-size is low.
   <https://github.com/jhy/jsoup/issues/271>

*** Release 1.7.1 [2012-Sep-23]
 * Improved parse time, now 2.3x faster than previous release, with lower memory consumption.

 * Reduced memory consumption when selecting elements.

 * Introduced finer granularity of exceptions in Jsoup.connect, including HttpStatusException and
   UnsupportedMimeTypeException.
   <https://github.com/jhy/jsoup/issues/229>

 * Fixed an issue when determining the Windows-1254 character-set from a meta tag when run in the Turkish locale.
   <https://github.com/jhy/jsoup/issues/191>

 * Fixed whitespace preservation in <textarea> tags.
   <https://github.com/jhy/jsoup/issues/167>

 * In jsoup.connect, fail faster if the return content type is not supported.
   <https://github.com/jhy/jsoup/issues/153>

 * In jsoup.clean, allow custom OutputSettings, to control pretty printing, character set, and entity escaping.
   <https://github.com/jhy/jsoup/issues/148>

 * Fixed an issue that prevented frameset documents from being cleaned by the Cleaner.
   <https://github.com/jhy/jsoup/issues/154>

 * Fixed an issue when normalising whitespace for strings containing high-surrogate characters.
   <https://github.com/jhy/jsoup/issues/214>

 * If a server doesn't specify a content-type header, treat that as OK.
   <https://github.com/jhy/jsoup/issues/213>

 * If a server returns an unsupported character-set header, attempt to decode the content with the default charset
   (UTF-8), instead of bailing with an unsupported charset exception.
   <https://github.com/jhy/jsoup/issues/215>

 * Removed an unnecessary synchronisation in Tag.valueOf, allowing multi-threaded parsing to run faster.
   <https://github.com/jhy/jsoup/issues/238>

 * Made entity decoding less greedy, so that non-entities are less likely to be incorrectly treated as entities.
   <https://github.com/jhy/jsoup/issues/224>

 * Whitespace normalise document.title() output.
   <https://github.com/jhy/jsoup/issues/168>

 * In Jsoup.connection, enforce a connection disconnect after every connect. This precludes keep-alive connections to
   the same host, but in practice many implementations will leak connections, particularly on error.

*** Release 1.6.3 [2012-May-28]
 * Fixed parsing of group-or commas in CSS selectors, to correctly handle sub-queries containing commas.
   <https://github.com/jhy/jsoup/issues/179>

 * If a node has no parent, return null on previousSibling and nextSibling instead of throwing a null pointer exception.
   <https://github.com/jhy/jsoup/issues/184>

 * Updated Node.siblingNodes() and Element.siblingElements() to exclude the current node (a node is not its own sibling).

 * Fixed HTML entity parser to correctly parse entities like frac14 (letter + number combo).
   <https://github.com/jhy/jsoup/issues/145>

 * Fixed issue where contents of a script tag within a comment could be incorrectly parsed.
   <https://github.com/jhy/jsoup/issues/115>

 * Fixed GAE support: load HTML entities from a file on startup, instead of embedding in the class.

 * Fixed NPE when HTML fragment parsing a <style> tag
   <https://github.com/jhy/jsoup/issues/189>

 * Fixed issue with :all pseudo-tag in HTML sanitizer when cleaning tags previously defined in safelist
   <https://github.com/jhy/jsoup/issues/156>

 * Fixed NPE in Parser.parseFragment() when context parameter is null.
   <https://github.com/jhy/jsoup/issues/195>

 * In HTML Safelists, when defining allowed attributes for a tag, automatically add the tag to the allowed list.

*** Release 1.6.2 [2012-Mar-27]
 * Added a simplified XML parsing mode, which can usefully parse valid and invalid XML, but does not enforce any HTML
   document structure or special tag behaviour.

 * Added the optional ability to track errors when tokenising and parsing.

 * Added jsoup.connect.cookies(Map) method, to set multiple cookies at once, possibly from a prior request.

 * Added Element.textNodes() and Element.dataNodes(), to easily access an element's child text nodes and data nodes.

 * Added an example program that demonstrates how to format HTML as plain-text, and the use of the NodeVisitor interface.

 * Added Node.traverse() and Elements.traverse() methods, to iterate through a node's descendants.

 * Updated jsoup.connect so that when requests made as POSTs are redirected, the redirect is followed as a GET.
   <https://github.com/jhy/jsoup/issues/120>

 * Updated the Cleaner and Safelists to optionally preserve relative links in elements, instead of converting them
   to absolute links.

 * Updated the Cleaner to support custom allowed protocols such as "cid:" and "data:".
   <https://github.com/jhy/jsoup/issues/127>

 * Updated handling of <base href> tags, to act on only the first one seen when parsing, to align with modern browsers.

 * Updated Node.setBaseUri(), to recursively set on all the node's descendants.

 * Fixed handling of null characters within comments.
   <https://github.com/jhy/jsoup/issues/121>

 * Tweaked escaped entity detection in attributes to not treat &entity_... as an entity form.
   <https://github.com/jhy/jsoup/issues/129>

 * Fixed doctype tokeniser to allow whitespace between name and public identifier.

 * Fixed issue where comments within a table tag would be duplicate-fostered into body.
   <https://github.com/jhy/jsoup/pull/165>

 * Fixed an issue where a spurious byte-order-mark at the start of a document would cause the parser to miss head
   contents.
   <https://github.com/jhy/jsoup/issues/134>

 * Fixed an issue where content after a frameset could cause an NPE crash. Now correctly implements spec and ignores
   the trailing content.
   <https://github.com/jhy/jsoup/issues/162>

 * Tweaked whitespace checks to align with HTML spec
   <https://github.com/jhy/jsoup/pull/175>

 * Tweaked HTML output of closing script and style tags to not add an extraneous newline when pretty-printing.

 * Substantially reduced default memory allocation within Node.outerHtml, to reduce memory pressure when serialising
   smaller DOMs.
   <https://github.com/jhy/jsoup/issues/143>

*** Release 1.6.1 [2011-Jul-02]
 * Fixed Java 1.5 compatibility.
   <https://github.com/jhy/jsoup/issues/103>

 * Fixed an issue when parsing <script> tags in body where the tokeniser wouldn't switch to the InScript state, which
   meant that data wasn't parsed correctly.
   <https://github.com/jhy/jsoup/issues/104>

 * Fixed an issue with a missing quote when serialising DocumentType nodes.
   <https://github.com/jhy/jsoup/issues/109>

 * Fixed issue where a single 0 character was lexed incorrectly as a null character.
   <https://github.com/jhy/jsoup/issues/107>

 * Fixed normalisation of carriage returns to newlines on input HTML.
   <https://github.com/jhy/jsoup/issues/110>

 * Disabled memory mapped files when loading files from disk, to improve compatibility in Windows environments.

*** Release 1.6.0 [2011-Jun-13]
 * HTML5 conformant parser. Complete reimplementation of HTML tokenisation and parsing, to implement the
   http://whatwg.org/html spec. This ensures jsoup parses HTML identically to current modern browsers.

 * When parsing files from disk, files are loaded via memory mapping, to increase parse speed.

 * Reduced memory overhead and lowered garbage collector pressure with Attribute, Node and Element model optimisations.

 * Improved "abs:" absolute URL handling in Elements.attr("abs:href") and Node.hasAttr("abs:href").
   <https://github.com/jhy/jsoup/issues/97>

 * Fixed cookie handling issue in jsoup.Connect where empty cookies would cause a validation exception.
   <https://github.com/jhy/jsoup/issues/87>

 * Added jsoup.Connect configuration options to allow HTTP errors to be ignored, and the content-type to be ignored.
   Contributed by Jesse Piascik (piascikj)
   <https://github.com/jhy/jsoup/pull/78>

 * Added Node.before(node) and Node.after(node), to allow existing nodes to be moved, or new nodes to be inserted, into
   precise DOM positions.

 * Added Node.unwrap() and Elements.unwrap(), to remove a node but keep its contents. Useful for e.g. removing unwanted
   formatting tags.
   <https://github.com/jhy/jsoup/issues/100>

 * Now handles unclosed <title> tags in document by breaking out of the title at the next start tag, instead of
   eating up to the end of the document.
   <https://github.com/jhy/jsoup/issues/82>

 * Added OSGi bundle support to the jsoup package jar.
   <https://github.com/jhy/jsoup/issues/98>

*** Release 1.5.2 [2011-Feb-27]
 * Fixed issue with selector parser where some boolean AND + OR combined queries (e.g. "meta[http-equiv], meta[content]")
   were being parsed incorrectly as OR only queries (the former was parsed as "meta, [http-equiv], meta[content]").

 * Fixed issue where a content-type specified in a meta tag may not be reliably detected, due to the above issue.

 * Updated Element.text() and Element.ownText() methods to ensure <br> tags output as whitespace.

 * Tweaked Element.outerHtml() method to not generate initial newline on first output element.
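
The AND + OR grouping behind the 1.5.2 selector fix can be illustrated with a small sketch. This is not jsoup's
parser: the class name SelectorGroups and the method splitOr are hypothetical, and the code only shows the idea that
commas separate OR alternatives while everything between two top-level commas stays together as one AND group.
Quoted attribute values containing brackets or commas are not handled.

```java
import java.util.ArrayList;
import java.util.List;

public class SelectorGroups {
    /**
     * Splits a selector list on top-level commas only. Commas nested inside
     * [] or () (e.g. within :has(...)) do not start a new alternative.
     */
    public static List<String> splitOr(String query) {
        List<String> groups = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int depth = 0; // nesting level of brackets and parentheses
        for (char c : query.toCharArray()) {
            if (c == '[' || c == '(') depth++;
            else if (c == ']' || c == ')') depth--;
            if (c == ',' && depth == 0) {
                // a top-level comma closes the current AND group
                groups.add(current.toString().trim());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        groups.add(current.toString().trim());
        return groups;
    }
}
```

With this sketch, "meta[http-equiv], meta[content]" yields two groups, "meta[http-equiv]" and "meta[content]", so
the tag name and the attribute test in each alternative remain a single AND condition.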

*** Release 1.5.1 [2011-Feb-19]

 * Integrated new single-pass selector evaluators, contributed by knz (Anton Kazennikov). This significantly speeds up
   the execution of combined selector queries.

 * Implemented workaround to fix Scala support. Contributed by bbeck (Brandon Beck).

 * Added ability to change an element's tag with Element.tagName(String), and to change many at once
   with Elements.tagName(String).

 * Added Node.wrap(html), Node.before(html), and Node.after(html), to allow HTML to be easily added to all nodes. These
   functions were previously supported on Elements only.

 * Added TextNode.splitText(index), which allows a text node to be split into two nodes at a specified index point.
   This is convenient if you need to surround some text in an element.

 * Updated Jsoup.Connection so that cookies set on a redirect response will be included on both the redirected request
   and response.

 * Infinite redirection loops in Jsoup.Connect are now prevented.

 * Allow Jsoup.Connect to parse application/xml and application/xhtml+xml responses.

 * Modified Jsoup.Connect to always follow relative links, regardless of the underlying HTTP sub-system.

 * Defined U (underline) element as an inline tag.

 * Force strict entity matching (must be &xxx; and not &xxx) in element attributes.

 * Implemented clone method for Elements (contributed by knz).

 * Fixed tokeniser optimisation when scanning for missing data element close tags.

 * Fixed issue when using descendant regex attribute selectors.

*** Release 1.4.1 [2010-Nov-23]

 * Added ability to load and parse HTML from an input stream.

 * Implemented Node.clone() to create deep, independent copies of Nodes, Elements, and Documents.

 * Added :not() selector, to find elements that do not match the selector. E.g. div:not(.logo) finds divs that
   do not have the "logo" class name.

 * Added Elements.not(selector) method, to remove undesired results from selector results.

 * Implemented DataNode.setWholeData() to allow updating of script and style data contents.

 * Relaxed parse rules of H1 - H6, to allow nested content. This is against spec, but matches browser and publisher
   behaviour.

 * Relaxed parse rule of SPAN to treat as block, to allow nested block content.

 * Fixed issue in jsoup.connect when extracting character set from content-type header; now supports quoted
   charset declaration.

 * Fixed support for jsoup.connect to follow redirects between http & https URLs.

 * Document normalisation now more enthusiastically enforces the correct document structure.

 * Support node.outerHtml() method when node has no parent (e.g. when it has been removed from its DOM tree)

 * Fixed support for HTML entities with numbers in name (e.g. &frac34; for ¾, &sup1; for ¹).

 * Fixed absolute URL generation from relative URLs which are only query strings.
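
The quoted charset declaration mentioned in the 1.4.1 notes refers to headers such as
Content-Type: text/html; charset="UTF-8". The following is a minimal sketch of extracting the charset from such a
header, assuming a simple regular expression is sufficient. It is not jsoup's implementation, and the class name
ContentTypeCharset and the method charsetOf are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ContentTypeCharset {
    // Matches charset=value, where the value may be wrapped in single or double quotes.
    private static final Pattern CHARSET =
        Pattern.compile("(?i)\\bcharset\\s*=\\s*[\"']?([^\\s;\"']+)");

    /** Returns the declared charset name, or null if the header declares none. */
    public static String charsetOf(String contentType) {
        if (contentType == null) return null;
        Matcher m = CHARSET.matcher(contentType);
        return m.find() ? m.group(1) : null;
    }
}
```

A caller would typically still validate the returned name (for example with java.nio.charset.Charset.isSupported)
before decoding, since servers can send unknown or misspelled charset names.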

*** Release 1.3.3 [2010-Sep-19]
 * Implemented Elements.empty() and Elements.remove(). This allows easy element removal, like:
     doc.select("iframe").remove();

 * Fixed issue in Entities when unescaping an entity that decodes to "$"
   <http://github.com/jhy/jsoup/issues/issue/34>

 * Added restricted XHTML output entity option
   <http://github.com/jhy/jsoup/issues/issue/35>

*** Release 1.3.2 [2010-Aug-30]
 * Treat HTTP headers as case insensitive in Jsoup.Connection. Improves compatibility for HTTP responses.

 * Improved malformed table parsing by implementing ignorable end tags.

*** Release 1.3.1 [2010-Aug-23]
 * Removed dependency on Apache Commons-lang. Jsoup now has no external dependencies.

 * Added new Connection implementation, to enable easier and richer HTTP requests that parse to Documents. This includes
   support for gzip responses, cookies, headers, data parameters, user-agent, referrer, etc.

 * Added Element.ownText() method, to get only the direct text of an element, not including the text of its children.

 * Added support for selectors :containsOwn(text) and :matchesOwn(regex), to supplement Element.ownText().

 * Added support for non-pretty-printed HTML output, to more closely mirror the input HTML.

 * Further speed optimisations for parsing and output generation.

 * Fixed support for case-sensitive HTML escape entities.
   <http://github.com/jhy/jsoup/issues/issue/31>

 * Fixed issue when parsing tags with keyless attributes.
   <http://github.com/jhy/jsoup/issues/issue/32>

*** Release 1.2.3 [2010-Aug-04]
 * Added support for automatic input character set detection and decoding. Jsoup now automatically detects the encoding
   character set when parsing HTML from a File or URL. The parser checks the content-type header, then the
   <meta http-equiv> or <meta charset> tag, and finally falls back to UTF-8.

 * Added ability to configure the document's output charset, to control which characters are HTML escaped, and which
   are kept intact. The output charset defaults to the document's input charset. This simplifies non-ascii output.

 * Added full support for all new HTML5 tags.

 * Added support for HTML5 dataset custom data attributes, with the Element.dataset() map.

 * Added support for the [^attributePrefix] selector query, to find elements with attributes starting with a prefix.
   Useful for finding elements with datasets: [^data-] matches <p data-name="jsoup">

 * Added support for namespaced elements (<fb:name>) and selectors to find them (fb|name)

 * Implemented Node.ownerDocument DOM method

 * Improved implicit table element handling (particularly around thead, tbody, and tfoot).

 * Improved HTML output format for empty elements and auto-detected self closing tags

 * Changed DT & DD tags to block-mode tags, to follow practice over spec

 * Added support for tag names with - and _ (<abc_foo>, <abc-foo>)

 * Handle tags with internal trailing space (<foo >)

 * Fixed support for character class regular expressions in [attr~=regex] selector

*** Release 1.2.2 [2010-Jul-11]

 * Performance optimisation:
    - core HTML parser engine now 3.5 times faster
    - HTML generator now 2.5 times faster
    - much lower memory use and garbage collection time

 * Added support for :matches(regex) selector, to find elements containing text matching regular expression

 * Added support for [key~=regex] attribute selector, to find elements with attribute values matching regular expression

 * Upgraded the selector query parser to allow nested selectors like 'div:has(p:matches(regex))'

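The attribute selectors added in 1.2.3 and 1.2.2 ([^attributePrefix] and [key~=regex]) can be read as two simple
predicates over an element's attributes. The sketch below models attributes as a plain Map and is not jsoup code; the
class name AttributeMatchers and both method names are hypothetical, and it assumes the regex may match anywhere
within the attribute value.

```java
import java.util.Map;
import java.util.regex.Pattern;

public class AttributeMatchers {
    /** Models [^prefix]: true if any attribute key starts with the given prefix. */
    public static boolean hasKeyPrefix(Map<String, String> attrs, String prefix) {
        for (String key : attrs.keySet()) {
            if (key.startsWith(prefix)) return true;
        }
        return false;
    }

    /** Models [key~=regex]: true if the attribute exists and the regex is found in its value. */
    public static boolean valueMatches(Map<String, String> attrs, String key, String regex) {
        String value = attrs.get(key);
        // find() accepts a partial match; assumed here, matches() would require the whole value
        return value != null && Pattern.compile(regex).matcher(value).find();
    }
}
```

For the element <p data-name="jsoup">, hasKeyPrefix(attrs, "data-") is true, which corresponds to the [^data-]
example given in the 1.2.3 notes.
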
*** Release 1.2.1 [2010-Jun-21]
 * Added .before(html) and .after(html) methods to Element and Elements, to insert sibling HTML

 * Added :contains(text) selector, to search for elements containing the specified text

 * Added :has(selector) pseudo-selector
   <http://github.com/jhy/jsoup/issues/issue/20>

 * Added Element#parents and Elements#parents to retrieve an element's ancestor chain
   <http://github.com/jhy/jsoup/issues/issue/20>

 * Fixed an issue where appending / prepending rows to a table (or to similar implicit
   element structures) would create redundant wrapping elements
   <http://github.com/jhy/jsoup/issues/issue/21>

 * Improved implicit close tag heuristic detection when parsing malformed HTML

 * Fixed an issue where text content after a script (or other data-node) was
   incorrectly added to the data node.
   <http://github.com/jhy/jsoup/issues/issue/22>

 * Fixed an issue where text order was incorrect when parsing pre-document
   HTML.
   <http://github.com/jhy/jsoup/issues/issue/23>

*** Release 1.1.1 [2010-Jun-08]
 * Added selector support for :eq, :lt, and :gt
   <http://github.com/jhy/jsoup/issues/issue/16>

 * Added TextNode#text and TextNode#text(String)
   <http://github.com/jhy/jsoup/issues/issue/18>

 * Throw exception if trying to parse non-text content
   <http://github.com/jhy/jsoup/issues/issue/17>

 * Added Node#remove and Node#replaceWith
   <http://github.com/jhy/jsoup/issues/issue/19>

 * Allow _ and - in CSS ID selectors (per CSS spec).
   <http://github.com/jhy/jsoup/issues/issue/10>

 * Relative links are resolved to absolute when cleaning, to normalize
   output and to verify safe protocol. (Were previously discarded.)
   <http://github.com/jhy/jsoup/issues/issue/12>

 * Allow combinators at start of selector query, for query refinements
   <http://github.com/jhy/jsoup/issues/issue/13>

 * Added Element#val() and #val(String) methods, for form values
   <http://github.com/jhy/jsoup/issues/issue/14>

 * Changed textarea contents to parse as TextNodes, not DataNodes,
   so that the contents are visible to text() (and to val(), as a textarea is treated as a form input)

 * Fixed support for Java 1.5

*** Release 0.3.1 (2010-Feb-20)
 * New features: supports Elements#html(), html(String),
   prepend(String), append(String); bulk methods for corresponding
   methods in Element.

 * New feature: Jsoup.isValid(html, safelist) method for user input
   form validation.

 * Improved Elements.attr(String) to find first matching element
   with attribute.

 * Fixed assertion error when cleaning HTML with empty attribute
   <http://github.com/jhy/jsoup/issues/issue/7>

*** Release 0.2.2 (2010-Feb-07)
 * jsoup packages are now available in the Maven central repository.

 * New feature: supports Element#addClass, removeClass, toggleClass;
   also collection class methods on Elements.
 * New feature: supports Element#wrap(html) and Elements#wrap(html).
 * New selector syntax: supports E + F adjacent sibling selector
 * New selector syntax: supports E ~ F preceding sibling selector
 * New: supports Element#elementSiblingIndex()

 * Improved document normalisation.
 * Improved HTML string output format (pretty-print)

 * Fixed absolute URL resolution issue when a base tag has no href.
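
The class toggling introduced in 0.2.2 amounts to treating the class attribute as an ordered set of names. The sketch
below shows that idea on a plain string. It is an illustration only, not jsoup's Element#toggleClass; the class name
ClassNames and the method toggle are hypothetical.

```java
import java.util.LinkedHashSet;
import java.util.Set;

public class ClassNames {
    /**
     * Removes the name from a whitespace-separated class attribute if it is
     * present, otherwise appends it. Existing order is preserved.
     */
    public static String toggle(String classAttr, String name) {
        Set<String> names = new LinkedHashSet<>();
        for (String n : classAttr.trim().split("\\s+")) {
            if (!n.isEmpty()) names.add(n); // skip the empty token produced by a blank attribute
        }
        if (!names.remove(name)) names.add(name);
        return String.join(" ", names);
    }
}
```

Using a LinkedHashSet keeps the class names unique while retaining their original order, so toggling a name twice
returns an attribute with the same names as the input.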

*** Release 0.1.2 (2010-Feb-02)
 * Fixed unrecognised tag handler to be more permissive
   <http://github.com/jhy/jsoup/issues/issue/1>


*** Release 0.1.1 (2010-Jan-31)
 * Initial beta release of jsoup