Add XML document builder and eventifier with scala-xml implementation #328

Merged: 6 commits, Jun 9, 2022
3 changes: 3 additions & 0 deletions .sbtopts
Original file line number Diff line number Diff line change
@@ -0,0 +1,3 @@
-J-Xms2g
-J-Xmx4g
-J-XX:MaxMetaspaceSize=512m
25 changes: 22 additions & 3 deletions build.sbt
@@ -35,7 +35,7 @@ val commonSettings = List(
}
},
organization := "org.gnieh",
headerLicense := Some(HeaderLicense.ALv2("2021", "Lucas Satabin")),
headerLicense := Some(HeaderLicense.ALv2("2022", "Lucas Satabin")),
licenses += ("The Apache Software License, Version 2.0" -> url("https://www.apache.org/licenses/LICENSE-2.0.txt")),
homepage := Some(url("https://github.com/satabin/fs2-data")),
versionScheme := Some("early-semver"),
@@ -50,7 +50,7 @@ val commonSettings = List(
case Some((2, n)) if n < 13 =>
List("-Ypartial-unification", "-language:higherKinds")
case Some((3, _)) =>
List("-Ykind-projector")
List("-Ykind-projector", "-source:future-migration", "-no-indent")
}
.toList
.flatten,
@@ -118,7 +118,8 @@ val root = (project in file("."))
jsonDiffson.js,
jsonPlay.js,
text.js,
xml.js),
xml.js,
scalaXml.js),
ScalaUnidoc / siteSubdirName := "api",
addMappingsToSiteDir(ScalaUnidoc / packageDoc / mappings, ScalaUnidoc / siteSubdirName),
Nanoc / sourceDirectory := file("site"),
@@ -142,6 +143,8 @@ val root = (project in file("."))
jsonInterpolators.js,
xml.jvm,
xml.js,
scalaXml.jvm,
scalaXml.js,
cbor.jvm,
cbor.js
)
@@ -305,6 +308,21 @@ lazy val xml = crossProject(JVMPlatform, JSPlatform)
)
.dependsOn(text)

lazy val scalaXml = crossProject(JVMPlatform, JSPlatform)
.crossType(CrossType.Pure)
.in(file("xml/scala-xml"))
.settings(commonSettings)
.settings(publishSettings)
.settings(
name := "fs2-data-xml-scala",
description := "Support for Scala XML ASTs",
libraryDependencies += "org.scala-lang.modules" %%% "scala-xml" % "2.1.0"
)
.jsSettings(
scalaJSLinkerConfig ~= (_.withModuleKind(ModuleKind.CommonJSModule))
)
.dependsOn(xml % "compile->compile;test->test")

lazy val cbor = crossProject(JVMPlatform, JSPlatform)
.crossType(CrossType.Full)
.in(file("cbor"))
@@ -345,6 +363,7 @@ lazy val documentation = project
jsonCirce.jvm,
jsonInterpolators.jvm,
xml.jvm,
scalaXml.jvm,
cbor.jvm)

lazy val benchmarks = project
2 changes: 1 addition & 1 deletion documentation/docs/json/index.md
@@ -145,7 +145,7 @@ The `filter` preserves the chunk structure, so that the stream fails as soon as

To handle Json ASTs, you can use the types and pipes available in the `fs2.data.json.ast` package.

JSON ASTs can be built if you provider an implicit [`Builder[Json]`][builder-api] to the `values` pipe. The `Builder[Json]` typeclass describes how JSON ASTs of type `Json` are built from streams.
JSON ASTs can be built if you provide an implicit [`Builder[Json]`][builder-api] to the `values` pipe. The `Builder[Json]` typeclass describes how JSON ASTs of type `Json` are built from streams.

```scala mdoc:compile-only
import ast._
34 changes: 33 additions & 1 deletion documentation/docs/xml/index.md
@@ -33,7 +33,7 @@ val input = """<a xmlns:ns="http://test.ns">
| test entity resolution &amp; normalization
|</a>""".stripMargin

val stream = Stream.emit(input).through(events[IO, String])
val stream = Stream.emit(input).through(events[IO, String]())
stream.compile.toList.unsafeRunSync()
```

@@ -63,3 +63,35 @@ Once entites and namespaces are resolved, the events might be numerous and can b
val normalized = entityResolved.through(normalize)
normalized.compile.toList.unsafeRunSync()
```

### DOM builder and eventifier

To handle XML DOM, you can use the types and pipes available in the `fs2.data.xml.dom` package.

An XML DOM can be built if you provide an implicit [`DocumentBuilder[Doc]`][builder-api] to the `documents` pipe. The `DocumentBuilder[Doc]` typeclass describes how XML documents of type `Doc` are built from an XML event stream.

```scala mdoc:compile-only
import dom._

trait SomeDocType

implicit val builder: DocumentBuilder[SomeDocType] = ???
stream.through(documents[IO, SomeDocType])
```

Conversely, the pipe transforming a stream of `Doc`s into a stream of XML events is called `eventify` and requires an implicit [`DocumentEventifier[Doc]`][eventifier-api] in scope.

```scala mdoc:compile-only
import dom._

trait SomeDocType

implicit val builder: DocumentBuilder[SomeDocType] = ???
implicit val eventifier: DocumentEventifier[SomeDocType] = ???

stream.through(documents[IO, SomeDocType])
.through(eventify[IO, SomeDocType])
```

[builder-api]: /api/fs2/data/xml/dom/DocumentBuilder.html
[eventifier-api]: /api/fs2/data/xml/dom/DocumentEventifier.html
41 changes: 41 additions & 0 deletions documentation/docs/xml/libraries.md
@@ -0,0 +1,41 @@
---
title: XML Libraries
index: 1
module: xml
---

Bindings to a new Scala XML DOM library can be added by implementing the `DocumentBuilder` and `DocumentEventifier` traits. `fs2-data` provides some of them out of the box.

This page covers the following libraries:
* Contents
{:toc}

### `scala-xml`

Module: [![Maven Central](https://img.shields.io/maven-central/v/org.gnieh/fs2-data-xml-scala_2.13.svg)](https://mvnrepository.com/artifact/org.gnieh/fs2-data-xml-scala_2.13)

The `fs2-data-xml-scala` module provides `DocumentBuilder` and `DocumentEventifier` instances for the [scala-xml][scala-xml] `Document` type.

The `documents` Pipe emits `scala.xml.Document`s.

```scala mdoc:to-string
import fs2.{Fallible, Stream}
import fs2.data.xml._
import fs2.data.xml.dom._
import fs2.data.xml.scalaXml._

val input = """<?xml version="1.1" ?>
|<root attr1="value1" attr2="value2">
| <!-- a comment -->
| Some text &amp; a child.
| <nested>With text</nested>
|</root>""".stripMargin

val evts = Stream.emits(input)
.through(events[Fallible, Char]())
.through(documents)

evts.compile.toList
```

[scala-xml]: https://github.com/scala/scala-xml
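
To try the module described in this page, a dependency line along the following lines should be enough in sbt. The organization and artifact name are taken from the `build.sbt` changes in this pull request; the version is a placeholder, since no release number appears here, and Scala.js builds would presumably use `%%%` instead of `%%`.

```scala
// build.sbt fragment (sketch).
// "<version>" is a placeholder, not a real release number: replace it with
// the fs2-data version you actually use.
libraryDependencies += "org.gnieh" %% "fs2-data-xml-scala" % "<version>"
```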
12 changes: 8 additions & 4 deletions site/content/index.md
@@ -13,12 +13,14 @@ title: Home
Artefacts are published on maven, use your favorite build tool to bring it into your project.
Following modules are available:
- `fs2-data-json`: [![Maven Central](https://img.shields.io/maven-central/v/org.gnieh/fs2-data-json_2.13.svg)](https://mvnrepository.com/artifact/org.gnieh/fs2-data-json_2.13) A JSON parser and manipulation library
- `fs2-data-json-circe`: [![Maven Central](https://img.shields.io/maven-central/v/org.gnieh/fs2-data-json-circe_2.13.svg)](https://mvnrepository.com/artifact/org.gnieh/fs2-data-json-circe_2.13) [circe][circe] support for parsed JSON.
- `fs2-data-json-interpolators`: [![Maven Central](https://img.shields.io/maven-central/v/org.gnieh/fs2-data-json-interpolators_2.13.svg)](https://mvnrepository.com/artifact/org.gnieh/fs2-data-json-interpolators_2.13) [literally][literally] support for statically checked JSON interpolators.
- `fs2-data-json-diffson`: [![Maven Central](https://img.shields.io/maven-central/v/org.gnieh/fs2-data-json-diffson_2.13.svg)](https://mvnrepository.com/artifact/org.gnieh/fs2-data-json-diffson_2.13) [diffson][diffson] support for patching JSON streams.
- `fs2-data-json-circe`: [![Maven Central](https://img.shields.io/maven-central/v/org.gnieh/fs2-data-json-circe_2.13.svg)](https://mvnrepository.com/artifact/org.gnieh/fs2-data-json-circe_2.13) [circe][circe] support for parsed JSON.
- `fs2-data-json-play`: [![Maven Central](https://img.shields.io/maven-central/v/org.gnieh/fs2-data-json-play_2.13.svg)](https://mvnrepository.com/artifact/org.gnieh/fs2-data-json-play_2.13) [Play! JSON][play-json] support for parsed JSON.
- `fs2-data-json-interpolators`: [![Maven Central](https://img.shields.io/maven-central/v/org.gnieh/fs2-data-json-interpolators_2.13.svg)](https://mvnrepository.com/artifact/org.gnieh/fs2-data-json-interpolators_2.13) [literally][literally] support for statically checked JSON interpolators.
- `fs2-data-json-diffson`: [![Maven Central](https://img.shields.io/maven-central/v/org.gnieh/fs2-data-json-diffson_2.13.svg)](https://mvnrepository.com/artifact/org.gnieh/fs2-data-json-diffson_2.13) [diffson][diffson] support for patching JSON streams.
- `fs2-data-xml`: [![Maven Central](https://img.shields.io/maven-central/v/org.gnieh/fs2-data-xml_2.13.svg)](https://mvnrepository.com/artifact/org.gnieh/fs2-data-xml_2.13) An XML parser
- `fs2-data-xml-scala`: [![Maven Central](https://img.shields.io/maven-central/v/org.gnieh/fs2-data-xml-scala_2.13.svg)](https://mvnrepository.com/artifact/org.gnieh/fs2-data-xml-scala_2.13) [scala-xml][scala-xml] support for XML DOM.
- `fs2-data-csv`: [![Maven Central](https://img.shields.io/maven-central/v/org.gnieh/fs2-data-csv_2.13.svg)](https://mvnrepository.com/artifact/org.gnieh/fs2-data-csv_2.13) A CSV parser
- `fs2-data-csv-generic`: [![Maven Central](https://img.shields.io/maven-central/v/org.gnieh/fs2-data-csv-generic_2.13.svg)](https://mvnrepository.com/artifact/org.gnieh/fs2-data-csv-generic_2.13) generic decoder for CSV files
- `fs2-data-csv-generic`: [![Maven Central](https://img.shields.io/maven-central/v/org.gnieh/fs2-data-csv-generic_2.13.svg)](https://mvnrepository.com/artifact/org.gnieh/fs2-data-csv-generic_2.13) generic decoder for CSV files
- `fs2-data-cbor`: [![Maven Central](https://img.shields.io/maven-central/v/org.gnieh/fs2-data-cbor_2.13.svg)](https://mvnrepository.com/artifact/org.gnieh/fs2-data-cbor_2.13) CBOR parser and transformation


@@ -27,5 +29,7 @@ Following modules are available:
[cats-friendly-logo]: https://typelevel.org/cats/img/cats-badge-tiny.png
[fs2]: https://fs2.io
[circe]: https://circe.github.io/circe/
[play-json]: https://www.playframework.com/documentation/latest/ScalaJson
[diffson]: https://github.com/gnieh/diffson
[literally]: https://github.com/typelevel/literally
[scala-xml]: https://github.com/scala/scala-xml
118 changes: 118 additions & 0 deletions xml/scala-xml/src/main/scala/fs2/data/xml/scalaXml/package.scala
@@ -0,0 +1,118 @@
/*
* Copyright 2022 Lucas Satabin
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/

package fs2
package data
package xml

import dom.{DocumentBuilder, DocumentEventifier}

import cats.syntax.all._

import scala.xml._

package object scalaXml {

implicit object ScalaXmlBuilder extends DocumentBuilder[Document] {

type Content = Node
type Elem = scala.xml.Elem
type Misc = Node

def makeDocument(version: Option[String],
encoding: Option[String],
standalone: Option[Boolean],
doctype: Option[XmlEvent.XmlDoctype],
prolog: List[Misc],
root: Elem): Document = {
val document = new Document()
document.version = version
document.encoding = encoding
document.standAlone = standalone
document.children = prolog :+ root
document.docElem = root.head
document
}

def makeComment(content: String): Option[Misc] =
Comment(content).some

def makeText(texty: XmlEvent.XmlTexty): Content =
texty match {
case XmlEvent.XmlCharRef(value) => Text(new String(Character.toChars(value)))
case XmlEvent.XmlEntityRef(name) => EntityRef(name)
case XmlEvent.XmlString(s, false) => Text(s)
case XmlEvent.XmlString(s, true) => PCData(s)
}

def makeElement(name: QName, attributes: List[Attr], isEmpty: Boolean, children: List[Content]): Elem = {
val attrs = attributes.foldRight(Null: MetaData) { (attr, acc) =>
attr.name.prefix match {
case Some(prefix) => new PrefixedAttribute(prefix, attr.name.local, attr.value.map(makeText(_)), acc)
case None => new UnprefixedAttribute(attr.name.local, attr.value.map(makeText(_)), acc)
}
}
Elem(name.prefix.getOrElse(null), name.local, attrs, TopScope, isEmpty, children: _*)
}

def makePI(target: String, content: String): Misc =
ProcInstr(target, content)

}

implicit object ScalaXmlEventifier extends DocumentEventifier[Document] {

def eventify(node: Document): Stream[Pure, XmlEvent] =
innerEventify(node)

def innerEventify(node: NodeSeq): Stream[Pure, XmlEvent] =
node match {
case Comment(comment) => Stream.emit(XmlEvent.Comment(comment))
case PCData(content) => Stream.emit(XmlEvent.XmlString(content, true))
case Text(content) => Stream.emit(XmlEvent.XmlString(content, false))
case EntityRef(name) => Stream.emit(XmlEvent.XmlEntityRef(name))
case ProcInstr(target, content) => Stream.emit(XmlEvent.XmlPI(target, content))
case e: Elem =>
val isEmpty = e.minimizeEmpty && e.child.isEmpty
val name = QName(Option(e.prefix), e.label)
Stream.emit(XmlEvent.StartTag(name, makeAttributes(e.attributes, Nil), isEmpty)) ++ Stream
.emits(e.child)
.flatMap(innerEventify(_)) ++ Stream.emit(XmlEvent.EndTag(name))
case doc: Document =>
Stream.emit(XmlEvent.StartDocument) ++ Stream.emits(
doc.version.map(version => XmlEvent.XmlDecl(version, doc.encoding, doc.standAlone)).toSeq) ++ Stream
.emits(doc.children)
.flatMap(innerEventify(_)) ++ Stream.emit(XmlEvent.EndDocument)
case Group(children) => Stream.emits(children).flatMap(innerEventify(_))
case other => Stream.empty
}

private def makeTexty(nodes: List[Node]): List[XmlEvent.XmlTexty] =
nodes.collect {
case Text(s) => XmlEvent.XmlString(s, false)
case EntityRef(name) => XmlEvent.XmlEntityRef(name)
}

private def makeAttributes(md: MetaData, acc: List[Attr]): List[Attr] =
md match {
case Null => acc.reverse
case PrefixedAttribute(prefix, key, value, next) =>
makeAttributes(next, Attr(QName(Option(prefix), key), makeTexty(value.toList)) :: acc)
case UnprefixedAttribute(key, value, next) =>
makeAttributes(next, Attr(QName(key), makeTexty(value.toList)) :: acc)
}
}
}
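
The traversal in `innerEventify` above follows a common pattern: emit a start event, recurse into the children in order, then emit the matching end event. The sketch below reproduces only that pattern on a toy tree, using the standard library's `LazyList` in place of `fs2.Stream`. Every name in it (`Tree`, `Event`, `eventifyTree`, ...) is hypothetical and not part of fs2-data or scala-xml.

```scala
// Hypothetical toy event type, standing in for XmlEvent.
sealed trait Event
final case class Start(name: String) extends Event
final case class End(name: String) extends Event
final case class Chars(text: String) extends Event

// Hypothetical toy tree, standing in for scala.xml.Node.
sealed trait Tree
final case class Leaf(text: String) extends Tree
final case class Branch(name: String, children: List[Tree]) extends Tree

// Depth-first traversal: start event, then the children's events in
// document order, then the matching end event.
def eventifyTree(tree: Tree): LazyList[Event] =
  tree match {
    case Leaf(text) =>
      LazyList(Chars(text))
    case Branch(name, children) =>
      LazyList(Start(name)) ++
        children.to(LazyList).flatMap(eventifyTree) ++
        LazyList(End(name))
  }

val sample = Branch("a", List(Leaf("x"), Branch("b", Nil)))
val sampleEvents = eventifyTree(sample).toList
```

With this shape, every `Start` is balanced by an `End`, which is the property a downstream consumer rebuilding a tree from events would rely on.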
Original file line number Diff line number Diff line change
@@ -0,0 +1,25 @@
/*
* Copyright 2022 Lucas Satabin
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
package fs2
package data
package xml
package scalaXml

import dom._

import scala.xml.Document

object ScalaXmlEventifierSpec extends EventifierSpec[Document]
2 changes: 1 addition & 1 deletion xml/src/main/scala/fs2/data/xml/XmlException.scala
@@ -17,4 +17,4 @@ package fs2
package data
package xml

class XmlException(val error: XmlError, msg: String) extends Exception(msg)
case class XmlException(val error: XmlError, msg: String) extends Exception(msg)
42 changes: 42 additions & 0 deletions xml/src/main/scala/fs2/data/xml/dom/DocumentBuilder.scala
@@ -0,0 +1,42 @@
/*
* Copyright 2022 Lucas Satabin
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
package fs2
package data
package xml
package dom

trait DocumentBuilder[Document] {

type Content
type Misc <: Content
type Elem <: Content

def makeDocument(version: Option[String],
encoding: Option[String],
standalone: Option[Boolean],
doctype: Option[XmlEvent.XmlDoctype],
prolog: List[Misc],
root: Elem): Document

def makeComment(content: String): Option[Misc]

def makeText(texty: XmlEvent.XmlTexty): Content

def makeElement(name: QName, attributes: List[Attr], isEmpty: Boolean, children: List[Content]): Elem

def makePI(target: String, content: String): Misc

}
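
To illustrate how a trait of this shape can be implemented for a custom DOM, here is a minimal, self-contained sketch. It does not use the fs2-data types: `MiniBuilder`, `SimpleName`, `SimpleAttr` and the `T*` tree classes are hypothetical, simplified stand-ins, and the builder is reduced to documents, comments, text and elements. The reading that `makeComment` returns an `Option` so an implementation may drop comments is an assumption, not something this diff states.

```scala
// Hypothetical, simplified stand-ins; these are NOT the fs2-data types.
final case class SimpleName(prefix: Option[String], local: String) {
  def render: String = prefix.fold(local)(p => s"$p:$local")
}
final case class SimpleAttr(name: SimpleName, value: String)

// Same overall shape as DocumentBuilder: abstract type members for node
// kinds, and one factory method per kind.
trait MiniBuilder[Doc] {
  type Content
  type Misc <: Content
  type Elem <: Content

  def makeDocument(prolog: List[Misc], root: Elem): Doc
  def makeComment(content: String): Option[Misc]
  def makeText(text: String): Content
  def makeElement(name: SimpleName, attributes: List[SimpleAttr], children: List[Content]): Elem
}

// A toy DOM targeted by the builder.
sealed trait TNode
final case class TText(value: String) extends TNode
final case class TComment(value: String) extends TNode
final case class TElem(name: String, attrs: List[(String, String)], children: List[TNode]) extends TNode
final case class TDoc(prolog: List[TNode], root: TElem)

object ToyBuilder extends MiniBuilder[TDoc] {
  type Content = TNode
  type Misc = TNode
  type Elem = TElem

  def makeDocument(prolog: List[TNode], root: TElem): TDoc = TDoc(prolog, root)

  // This toy keeps comments; returning None here would discard them.
  def makeComment(content: String): Option[TNode] = Some(TComment(content))

  def makeText(text: String): TNode = TText(text)

  def makeElement(name: SimpleName, attributes: List[SimpleAttr], children: List[TNode]): TElem =
    TElem(name.render, attributes.map(a => a.name.render -> a.value), children)
}

val toyRoot = ToyBuilder.makeElement(
  SimpleName(None, "root"),
  List(SimpleAttr(SimpleName(Some("ns"), "a"), "1")),
  List(ToyBuilder.makeText("hi")))
val toyDoc = ToyBuilder.makeDocument(ToyBuilder.makeComment("c").toList, toyRoot)
```

Keeping `Content`, `Misc` and `Elem` as type members lets each binding choose its own node types while the generic pipe only relies on the subtype relations between them.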