# Obtaining YAML Line Numbers

## Scenario 1 - finding YAML line numbers from the JSON tree

A great feature of json-schema-validator is its ability to validate YAML documents against a JSON Schema. The way this is done, though, by pre-processing the YAML into a tree of [JsonNode](https://fasterxml.github.io/jackson-databind/javadoc/2.10/com/fasterxml/jackson/databind/JsonNode.html) objects, breaks the connection back to the original YAML source file. Very commonly, once the YAML has been validated against the schema, there is additional processing that checks the JSON tree for semantic or content errors and inconsistencies. From an end user's point of view, the ideal is to report such errors using line and column references back to the original YAML, but this information is not readily available from the processed JSON tree.

### Scenario 1, solution part 1 - capturing line details during initial parsing

One solution is to use a custom [JsonNodeFactory](https://fasterxml.github.io/jackson-databind/javadoc/2.10/com/fasterxml/jackson/databind/node/JsonNodeFactory.html) that returns custom JsonNode objects. These are created during initial parsing and record the original YAML location that was being parsed at the time they were created.
The example below shows this:

```java
import com.fasterxml.jackson.databind.node.ArrayNode;
import com.fasterxml.jackson.databind.node.BooleanNode;
import com.fasterxml.jackson.databind.node.JsonNodeFactory;
import com.fasterxml.jackson.databind.node.NullNode;
import com.fasterxml.jackson.databind.node.NumericNode;
import com.fasterxml.jackson.databind.node.ObjectNode;
import com.fasterxml.jackson.databind.node.TextNode;
import com.fasterxml.jackson.dataformat.yaml.YAMLParser;

public static class MyNodeFactory extends JsonNodeFactory
{
    YAMLParser yp;

    public MyNodeFactory(YAMLParser yp)
    {
        super();
        this.yp = yp;
    }

    @Override
    public ArrayNode arrayNode()
    {
        return new MyArrayNode(this, yp.getTokenLocation(), yp.getCurrentLocation());
    }

    @Override
    public BooleanNode booleanNode(boolean v)
    {
        return new MyBooleanNode(v, yp.getTokenLocation(), yp.getCurrentLocation());
    }

    @Override
    public NumericNode numberNode(int v)
    {
        return new MyIntNode(v, yp.getTokenLocation(), yp.getCurrentLocation());
    }

    @Override
    public NullNode nullNode()
    {
        return new MyNullNode(yp.getTokenLocation(), yp.getCurrentLocation());
    }

    @Override
    public ObjectNode objectNode()
    {
        return new MyObjectNode(this, yp.getTokenLocation(), yp.getCurrentLocation());
    }

    @Override
    public TextNode textNode(String text)
    {
        // Mirror the base factory: a null input yields null, not a NullNode.
        return (text != null) ? new MyTextNode(text, yp.getTokenLocation(), yp.getCurrentLocation()) : null;
    }
}
```

The example above includes a basic but usable subset of all possible JsonNode types. If your YAML needs them, then you should also consider the others, i.e. `byte`, `byte[]`, `raw`, `short`, `long`, `float`, `double`, `BigInteger`, `BigDecimal`.

There are some other important things to note from the example:

* Even in a reduced set, `ObjectNode` and `NullNode` should be included.
* The current return for methods that receive a null parameter value seems to be null rather than `NullNode` (based on inspecting the underlying `valueOf()` methods in the various `JsonNode` subclasses). Hence the implementation of the `textNode()` method above.

The actual work here is really being done by the YAMLParser: it holds the location of the token being parsed, and the current location in the file.
The first of these gives us a line and column number we can use to flag where an error or problem was found, and the second (if needed) lets us calculate a span to the end of the error, e.g. if we wanted to highlight or underline the text in error.

### Scenario 1, solution part 2 - augmented `JsonNode` subclasses

We can be as simple or fancy as we like in the `JsonNode` subclasses, but basically we need two things from them:

* An interface so that, when we are post-processing the JSON tree, we can recognize nodes that retain line number information
* An interface that lets us extract the relevant location information

Those could be the same thing of course, but in our case we separated them, as shown in the following example:

```java
import com.fasterxml.jackson.core.JsonLocation;
import com.fasterxml.jackson.databind.node.ArrayNode;
import com.fasterxml.jackson.databind.node.BooleanNode;
import com.fasterxml.jackson.databind.node.IntNode;
import com.fasterxml.jackson.databind.node.JsonNodeFactory;
import com.fasterxml.jackson.databind.node.NullNode;
import com.fasterxml.jackson.databind.node.ObjectNode;
import com.fasterxml.jackson.databind.node.TextNode;

// Marker interface: identifies nodes that carry source location details.
public interface LocationProvider
{
    LocationDetails getLocationDetails();
}

// Accessor interface: exposes the location details themselves.
public interface LocationDetails
{
    default int getLineNumber() { return 1; }
    default int getColumnNumber() { return 1; }
    default String getFilename() { return ""; }
}

public static class LocationDetailsImpl implements LocationDetails
{
    final JsonLocation currentLocation;
    final JsonLocation tokenLocation;

    public LocationDetailsImpl(JsonLocation tokenLocation, JsonLocation currentLocation)
    {
        this.tokenLocation = tokenLocation;
        this.currentLocation = currentLocation;
    }

    @Override
    public int getLineNumber() { return (tokenLocation != null) ? tokenLocation.getLineNr() : 1; }

    @Override
    public int getColumnNumber() { return (tokenLocation != null) ? tokenLocation.getColumnNr() : 1; }

    @Override
    public String getFilename()
    {
        // Guard against a missing source reference as well as a missing location.
        return (tokenLocation != null && tokenLocation.getSourceRef() != null)
            ? tokenLocation.getSourceRef().toString() : "";
    }
}

public static class MyNullNode extends NullNode implements LocationProvider
{
    final LocationDetails locDetails;

    public MyNullNode(JsonLocation tokenLocation, JsonLocation currentLocation)
    {
        super();
        locDetails = new LocationDetailsImpl(tokenLocation, currentLocation);
    }

    @Override
    public LocationDetails getLocationDetails() { return locDetails; }
}

public static class MyTextNode extends TextNode implements LocationProvider
{
    final LocationDetails locDetails;

    public MyTextNode(String v, JsonLocation tokenLocation, JsonLocation currentLocation)
    {
        super(v);
        locDetails = new LocationDetailsImpl(tokenLocation, currentLocation);
    }

    @Override
    public LocationDetails getLocationDetails() { return locDetails; }
}

public static class MyIntNode extends IntNode implements LocationProvider
{
    final LocationDetails locDetails;

    public MyIntNode(int v, JsonLocation tokenLocation, JsonLocation currentLocation)
    {
        super(v);
        locDetails = new LocationDetailsImpl(tokenLocation, currentLocation);
    }

    @Override
    public LocationDetails getLocationDetails() { return locDetails; }
}

public static class MyBooleanNode extends BooleanNode implements LocationProvider
{
    final LocationDetails locDetails;

    public MyBooleanNode(boolean v, JsonLocation tokenLocation, JsonLocation currentLocation)
    {
        super(v);
        locDetails = new LocationDetailsImpl(tokenLocation, currentLocation);
    }

    @Override
    public LocationDetails getLocationDetails() { return locDetails; }
}

public static class MyArrayNode extends ArrayNode implements LocationProvider
{
    final LocationDetails locDetails;

    public MyArrayNode(JsonNodeFactory nc, JsonLocation tokenLocation, JsonLocation currentLocation)
    {
        super(nc);
        locDetails = new LocationDetailsImpl(tokenLocation, currentLocation);
    }

    @Override
    public LocationDetails getLocationDetails() { return locDetails; }
}

public static class MyObjectNode extends ObjectNode implements LocationProvider
{
    final LocationDetails locDetails;

    public MyObjectNode(JsonNodeFactory nc, JsonLocation tokenLocation, JsonLocation currentLocation)
    {
        super(nc);
        locDetails = new LocationDetailsImpl(tokenLocation, currentLocation);
    }

    @Override
    public LocationDetails getLocationDetails() { return locDetails; }
}
```

### Scenario 1, solution part 3 - using the custom `JsonNodeFactory`

With the pieces we now have, we just need to tell the YAML library to make use of them, which involves a small modification to the normal sequence of processing.

```java
import java.util.Set;

import com.fasterxml.jackson.core.JsonLocation;
import com.fasterxml.jackson.core.JsonProcessingException;
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectReader;
import com.fasterxml.jackson.dataformat.yaml.YAMLFactory;
import com.fasterxml.jackson.dataformat.yaml.YAMLParser;
import com.networknt.schema.ValidationMessage;

this.yamlFactory = new YAMLFactory();

try (YAMLParser yp = yamlFactory.createParser(f))
{
    // Build the tree with our factory so every node records its YAML location.
    ObjectReader rdr = mapper.reader(new MyNodeFactory(yp));
    JsonNode jsonNode = rdr.readTree(yp);
    Set<ValidationMessage> msgs = mySchema.validate(jsonNode);

    if (msgs.isEmpty())
    {
        for (JsonNode item : jsonNode.get("someItem"))
        {
            processJsonItems(item);
        }
    }
    else
    {
        // ... we'll look at how to get line locations for ValidationMessage cases in Scenario 2
    }
}
// a JsonProcessingException seems to be the base exception for "gross" errors e.g.
// missing quotes at end of string etc.
catch (JsonProcessingException jpEx)
{
    JsonLocation loc = jpEx.getLocation();
    // ... do something with the loc details
}
```

Some notes on what is happening here:

* We instantiate our custom JsonNodeFactory with the YAMLParser reference, and the line locations get recorded for us as the file is parsed.
* If any exceptions are thrown, they will already contain a JsonLocation object that we can use directly if needed.
* If we get no validation messages, we know the JSON tree matches the schema and we can do any post-processing we need on the tree. We'll see how to report any issues with this in the next part.
* We'll look at how to get line locations for ValidationMessage errors in Scenario 2.

### Scenario 1, solution part 4 - extracting the line details

Having got everything prepared, actually getting the line locations is rather easy:

```java
import java.util.Iterator;
import java.util.Map;

import com.fasterxml.jackson.databind.JsonNode;

void processJsonItems(JsonNode item)
{
    Iterator<Map.Entry<String, JsonNode>> iter = item.fields();

    while (iter.hasNext())
    {
        Map.Entry<String, JsonNode> node = iter.next();
        extractErrorLocation(node.getValue());
    }
}

void extractErrorLocation(JsonNode node)
{
    // instanceof is false for null, so this also covers a null node.
    if (!(node instanceof LocationProvider)) { return; }

    // Note: we also know the "span" of the error section, i.e. from token location to current
    // location (first char after the token). If we wanted, at some stage we could use this to
    // highlight/underline all of the text in error.
    LocationDetails dets = ((LocationProvider) node).getLocationDetails();
    // ... do something with the details e.g. report an error/issue against the YAML line
}
```

So that's pretty much it: as we are processing the JSON tree, if there is any point at which we want to report something about the contents, we can do so with a reference back to the original YAML line number.

There is still a problem though: what if the validation against the schema fails?

## Scenario 2 - ValidationMessage line locations

Any validation failures against the schema come back in the form of a set of `ValidationMessage` objects. But these also do not contain original YAML source line information, and there's no easy way to inject it as we did for Scenario 1.
Luckily though, there is a trick we can use here!

Within the `ValidationMessage` object is something called the 'path' of the error, which we can access with the `getPath()` method. The syntax of this path by default is close to being [JSONPath](https://datatracker.ietf.org/doc/draft-ietf-jsonpath-base/), but it can be set explicitly to be either [JSONPath](https://datatracker.ietf.org/doc/draft-ietf-jsonpath-base/) or [JSONPointer](https://www.rfc-editor.org/rfc/rfc6901.html) expressions. In our case, as we already use [Jackson](https://github.com/FasterXML/jackson), which supports node lookups based on JSONPointer expressions, we will set the path expressions to be JSONPointers. This is achieved by configuring the reported path type through the `SchemaValidatorsConfig` before we read our schema:

```java
SchemaValidatorsConfig config = new SchemaValidatorsConfig();
config.setPathType(PathType.JSON_POINTER);
JsonSchema jsonSchema = JsonSchemaFactory.getInstance().getSchema(schema, config);
```

Having set paths to be JSONPointer expressions, we can use those pointers to locate the appropriate `JsonNode` instances. The following couple of methods illustrate this process:

```java
import com.fasterxml.jackson.core.JsonPointer;
import com.fasterxml.jackson.databind.JsonNode;
import com.networknt.schema.ValidationMessage;

JsonNode findJsonNode(ValidationMessage msg, JsonNode rootNode)
{
    // Construct the JSONPointer.
    JsonPointer pathPtr = JsonPointer.valueOf(msg.getPath());
    // Now see if we can find the node. Note that at() returns a MissingNode
    // rather than null when nothing matches the pointer.
    JsonNode node = rootNode.at(pathPtr);
    return node;
}

LocationDetails getLocationDetails(ValidationMessage msg, JsonNode rootNode)
{
    LocationDetails retval = null;
    JsonNode node = findJsonNode(msg, rootNode);
    // A MissingNode is not a LocationProvider, so retval stays null in that case.
    if (node instanceof LocationProvider)
    {
        retval = ((LocationProvider) node).getLocationDetails();
    }
    return retval;
}
```

## Summary

Although not trivial, the steps outlined here give us a way to track back to the original source YAML for a variety of possible reporting cases:

* JSON processing exceptions (mostly already done for us)
* Issues flagged during validation of the YAML against the schema
* Anything we need to report with source information during post-processing of the validated JSON tree
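
As a rough illustration of how these reporting cases could share one output path, the sketch below turns a `LocationDetails` plus a message into a conventional `file:line:column: message` string. The `IssueReport` class and its `format` method are hypothetical names introduced only for this example; the `LocationDetails` interface is repeated from part 2 so that the snippet compiles on its own without Jackson on the classpath.

```java
public class IssueReport
{
    // Copy of the LocationDetails interface from part 2, repeated so the sketch is self-contained.
    public interface LocationDetails
    {
        default int getLineNumber() { return 1; }
        default int getColumnNumber() { return 1; }
        default String getFilename() { return ""; }
    }

    // Hypothetical helper: formats an issue as "file:line:column: message".
    // Returns the bare message when no location is available, e.g. when
    // getLocationDetails(msg, rootNode) from Scenario 2 returned null.
    public static String format(LocationDetails dets, String message)
    {
        if (dets == null) { return message; }
        String file = dets.getFilename().isEmpty() ? "<unknown>" : dets.getFilename();
        return file + ":" + dets.getLineNumber() + ":" + dets.getColumnNumber() + ": " + message;
    }

    public static void main(String[] args)
    {
        LocationDetails dets = new LocationDetails()
        {
            @Override public int getLineNumber() { return 12; }
            @Override public int getColumnNumber() { return 5; }
            @Override public String getFilename() { return "config.yaml"; }
        };

        // prints "config.yaml:12:5: duplicate item name"
        System.out.println(format(dets, "duplicate item name"));
        // prints "schema could not be loaded"
        System.out.println(format(null, "schema could not be loaded"));
    }
}
```

In real use, the `LocationDetails` argument would come from a `LocationProvider` node (Scenario 1) or from the JSONPointer lookup (Scenario 2), and the message from your own checks or from the `ValidationMessage`.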