Implement a `rustdoc_include` preprocessor (#1003)

* Allow underscores in the link type name

* Add some tests for include anchors

* Include parts of Rust files and hide the rest

Fixes #618.

* Increase min supported Rust version to 1.35

* Add a test for a behavior of rustdoc_include I want to depend on

At first I thought this was a bug, but then I looked at some use cases
we have in TRPL and decided this was a feature that I'd like to use.
This commit is contained in:
Carol (Nichols || Goulding) 2019-10-05 18:27:03 -04:00 committed by Dylan DPC
parent 8cdeb121c5
commit ac1749ff2f
14 changed files with 2000 additions and 354 deletions

View File

@ -17,7 +17,7 @@ matrix:
env: TARGET=x86_64-unknown-linux-gnu
- rust: nightly
env: TARGET=x86_64-unknown-linux-gnu
- rust: 1.34.0 # Minimum required Rust version
- rust: 1.35.0 # Minimum required Rust version
env: TARGET=x86_64-unknown-linux-gnu
- rust: stable

View File

@ -41,7 +41,7 @@ There are multiple ways to install mdBook.
2. **From Crates.io**
This requires at least [Rust] 1.34 and Cargo to be installed. Once you have installed
This requires at least [Rust] 1.35 and Cargo to be installed. Once you have installed
Rust, type the following in the terminal:
```

View File

@ -81,7 +81,7 @@ This controls the build process of your book.
The following preprocessors are available and included by default:
- `links`: Expand the `{{ #playpen }}` and `{{ #include }}` handlebars
- `links`: Expand the `{{ #playpen }}`, `{{ #include }}`, and `{{ #rustdoc_include }}` handlebars
helpers in a chapter to include the contents of a file.
- `index`: Convert all chapter files named `README.md` into `index.md`. That is
to say, all `README.md` would be rendered to an index file `index.html` in the

View File

@ -3,7 +3,9 @@
## Hiding code lines
There is a feature in mdBook that lets you hide code lines by prepending them
with a `#`.
with a `#` [in the same way that Rustdoc does][rustdoc-hide].
[rustdoc-hide]: https://doc.rust-lang.org/stable/rustdoc/documentation-tests.html#hiding-portions-of-the-example
```bash
# fn main() {
@ -107,6 +109,70 @@ This is the full file.
Lines containing anchor patterns inside the included anchor are ignored.
## Including a file but initially hiding all except specified lines
The `rustdoc_include` helper is for including code from external Rust files that contain complete
examples, but only initially showing particular lines specified with line numbers or anchors in the
same way as with `include`.
The lines not in the line number range or between the anchors will still be included, but they will
be prefaced with `#`. This way, a reader can expand the snippet to see the complete example, and
Rustdoc will use the complete example when you run `mdbook test`.
For example, consider a file named `file.rs` that contains this Rust program:
```rust
fn main() {
let x = add_one(2);
assert_eq!(x, 3);
}
fn add_one(num: i32) -> i32 {
num + 1
}
```
We can include a snippet that initially shows only line 2 by using this syntax:
````hbs
To call the `add_one` function, we pass it an `i32` and bind the returned value to `x`:
```rust
\{{#rustdoc_include file.rs:2}}
```
````
This would have the same effect as if we had manually inserted the code and hidden all but line 2
using `#`:
````hbs
To call the `add_one` function, we pass it an `i32` and bind the returned value to `x`:
```rust
# fn main() {
let x = add_one(2);
# assert_eq!(x, 3);
# }
#
# fn add_one(num: i32) -> i32 {
# num + 1
#}
```
````
That is, it looks like this (click the "expand" icon to see the rest of the file):
```rust
# fn main() {
let x = add_one(2);
# assert_eq!(x, 3);
# }
#
# fn add_one(num: i32) -> i32 {
# num + 1
#}
```
## Inserting runnable Rust files
With the following syntax, you can insert runnable Rust files into your book:

View File

@ -1,5 +1,8 @@
use crate::errors::*;
use crate::utils::{take_anchored_lines, take_lines};
use crate::utils::{
take_anchored_lines, take_lines, take_rustdoc_include_anchored_lines,
take_rustdoc_include_lines,
};
use regex::{CaptureMatches, Captures, Regex};
use std::fs;
use std::ops::{Bound, Range, RangeBounds, RangeFrom, RangeFull, RangeTo};
@ -11,8 +14,15 @@ use crate::book::{Book, BookItem};
const ESCAPE_CHAR: char = '\\';
const MAX_LINK_NESTED_DEPTH: usize = 10;
/// A preprocessor for expanding the `{{# playpen}}` and `{{# include}}`
/// helpers in a chapter.
/// A preprocessor for expanding helpers in a chapter. Supported helpers are:
///
/// - `{{# include}}` - Insert an external file of any type. Include the whole file, only particular
///. lines, or only between the specified anchors.
/// - `{{# rustdoc_include}}` - Insert an external Rust file, showing the particular lines
///. specified or the lines between specified anchors, and include the rest of the file behind `#`.
/// This hides the lines from initial display but shows them when the reader expands the code
/// block and provides them to Rustdoc for testing.
/// - `{{# playpen}}` - Insert runnable Rust files
#[derive(Default)]
pub struct LinkPreprocessor;
@ -104,6 +114,7 @@ enum LinkType<'a> {
Escaped,
Include(PathBuf, RangeOrAnchor),
Playpen(PathBuf, Vec<&'a str>),
RustdocInclude(PathBuf, RangeOrAnchor),
}
#[derive(PartialEq, Debug, Clone)]
@ -172,6 +183,7 @@ impl<'a> LinkType<'a> {
LinkType::Escaped => None,
LinkType::Include(p, _) => Some(return_relative_path(base, &p)),
LinkType::Playpen(p, _) => Some(return_relative_path(base, &p)),
LinkType::RustdocInclude(p, _) => Some(return_relative_path(base, &p)),
}
}
}
@ -222,6 +234,15 @@ fn parse_include_path(path: &str) -> LinkType<'static> {
LinkType::Include(path, range_or_anchor)
}
fn parse_rustdoc_include_path(path: &str) -> LinkType<'static> {
let mut parts = path.splitn(2, ':');
let path = parts.next().unwrap().into();
let range_or_anchor = parse_range_or_anchor(parts.next());
LinkType::RustdocInclude(path, range_or_anchor)
}
#[derive(PartialEq, Debug, Clone)]
struct Link<'a> {
start_index: usize,
@ -241,6 +262,7 @@ impl<'a> Link<'a> {
match (typ.as_str(), file_arg) {
("include", Some(pth)) => Some(parse_include_path(pth)),
("playpen", Some(pth)) => Some(LinkType::Playpen(pth.into(), props)),
("rustdoc_include", Some(pth)) => Some(parse_rustdoc_include_path(pth)),
_ => None,
}
}
@ -281,6 +303,26 @@ impl<'a> Link<'a> {
)
})
}
LinkType::RustdocInclude(ref pat, ref range_or_anchor) => {
let target = base.join(pat);
fs::read_to_string(&target)
.map(|s| match range_or_anchor {
RangeOrAnchor::Range(range) => {
take_rustdoc_include_lines(&s, range.clone())
}
RangeOrAnchor::Anchor(anchor) => {
take_rustdoc_include_anchored_lines(&s, anchor)
}
})
.chain_err(|| {
format!(
"Could not read file for link {} ({})",
self.link_text,
target.display(),
)
})
}
LinkType::Playpen(ref pat, ref attrs) => {
let target = base.join(pat);
@ -326,7 +368,7 @@ fn find_links(contents: &str) -> LinkIter<'_> {
\\\{\{\#.*\}\} # match escaped link
| # or
\{\{\s* # link opening parens and whitespace
\#([a-zA-Z0-9]+) # link type
\#([a-zA-Z0-9_]+) # link type
\s+ # separating whitespace
([a-zA-Z0-9\s_.\-:/\\]+) # link target path and space separated properties
\s*\}\} # whitespace and link closing parens"

View File

@ -11,7 +11,10 @@ use std::borrow::Cow;
use std::fmt::Write;
use std::path::Path;
pub use self::string::{take_anchored_lines, take_lines};
pub use self::string::{
take_anchored_lines, take_lines, take_rustdoc_include_anchored_lines,
take_rustdoc_include_lines,
};
/// Replaces multiple consecutive whitespace characters with a single space character.
pub fn collapse_whitespace(text: &str) -> Cow<'_, str> {

View File

@ -18,33 +18,33 @@ pub fn take_lines<R: RangeBounds<usize>>(s: &str, range: R) -> String {
}
}
lazy_static! {
static ref ANCHOR_START: Regex = Regex::new(r"ANCHOR:\s*(?P<anchor_name>[\w_-]+)").unwrap();
static ref ANCHOR_END: Regex = Regex::new(r"ANCHOR_END:\s*(?P<anchor_name>[\w_-]+)").unwrap();
}
/// Take anchored lines from a string.
/// Lines containing anchor are ignored.
pub fn take_anchored_lines(s: &str, anchor: &str) -> String {
lazy_static! {
static ref RE_START: Regex = Regex::new(r"ANCHOR:\s*(?P<anchor_name>[\w_-]+)").unwrap();
static ref RE_END: Regex = Regex::new(r"ANCHOR_END:\s*(?P<anchor_name>[\w_-]+)").unwrap();
}
let mut retained = Vec::<&str>::new();
let mut anchor_found = false;
for l in s.lines() {
if anchor_found {
match RE_END.captures(l) {
match ANCHOR_END.captures(l) {
Some(cap) => {
if &cap["anchor_name"] == anchor {
break;
}
}
None => {
if !RE_START.is_match(l) {
if !ANCHOR_START.is_match(l) {
retained.push(l);
}
}
}
} else {
if let Some(cap) = RE_START.captures(l) {
if let Some(cap) = ANCHOR_START.captures(l) {
if &cap["anchor_name"] == anchor {
anchor_found = true;
}
@ -55,9 +55,70 @@ pub fn take_anchored_lines(s: &str, anchor: &str) -> String {
retained.join("\n")
}
/// Keep lines contained within the range specified as-is.
/// For any lines not in the range, include them but use `#` at the beginning. This will hide the
/// lines from initial display but include them when expanding the code snippet or testing with
/// rustdoc.
pub fn take_rustdoc_include_lines<R: RangeBounds<usize>>(s: &str, range: R) -> String {
let mut output = String::with_capacity(s.len());
for (index, line) in s.lines().enumerate() {
if !range.contains(&index) {
output.push_str("# ");
}
output.push_str(line);
output.push_str("\n");
}
output.pop();
output
}
/// Keep lines between the anchor comments specified as-is.
/// For any lines not between the anchors, include them but use `#` at the beginning. This will
/// hide the lines from initial display but include them when expanding the code snippet or testing
/// with rustdoc.
pub fn take_rustdoc_include_anchored_lines(s: &str, anchor: &str) -> String {
let mut output = String::with_capacity(s.len());
let mut within_anchored_section = false;
for l in s.lines() {
if within_anchored_section {
match ANCHOR_END.captures(l) {
Some(cap) => {
if &cap["anchor_name"] == anchor {
within_anchored_section = false;
}
}
None => {
if !ANCHOR_START.is_match(l) {
output.push_str(l);
output.push_str("\n");
}
}
}
} else {
if let Some(cap) = ANCHOR_START.captures(l) {
if &cap["anchor_name"] == anchor {
within_anchored_section = true;
}
} else if !ANCHOR_END.is_match(l) {
output.push_str("# ");
output.push_str(l);
output.push_str("\n");
}
}
}
output.pop();
output
}
#[cfg(test)]
mod tests {
use super::{take_anchored_lines, take_lines};
use super::{
take_anchored_lines, take_lines, take_rustdoc_include_anchored_lines,
take_rustdoc_include_lines,
};
#[test]
fn take_lines_test() {
@ -99,4 +160,93 @@ mod tests {
assert_eq!(take_anchored_lines(s, "test"), "dolor\nsit\namet");
assert_eq!(take_anchored_lines(s, "something"), "");
}
#[test]
fn take_rustdoc_include_lines_test() {
let s = "Lorem\nipsum\ndolor\nsit\namet";
assert_eq!(
take_rustdoc_include_lines(s, 1..3),
"# Lorem\nipsum\ndolor\n# sit\n# amet"
);
assert_eq!(
take_rustdoc_include_lines(s, 3..),
"# Lorem\n# ipsum\n# dolor\nsit\namet"
);
assert_eq!(
take_rustdoc_include_lines(s, ..3),
"Lorem\nipsum\ndolor\n# sit\n# amet"
);
assert_eq!(take_rustdoc_include_lines(s, ..), s);
// corner cases
assert_eq!(
take_rustdoc_include_lines(s, 4..3),
"# Lorem\n# ipsum\n# dolor\n# sit\n# amet"
);
assert_eq!(take_rustdoc_include_lines(s, ..100), s);
}
#[test]
fn take_rustdoc_include_anchored_lines_test() {
let s = "Lorem\nipsum\ndolor\nsit\namet";
assert_eq!(
take_rustdoc_include_anchored_lines(s, "test"),
"# Lorem\n# ipsum\n# dolor\n# sit\n# amet"
);
let s = "Lorem\nipsum\ndolor\nANCHOR_END: test\nsit\namet";
assert_eq!(
take_rustdoc_include_anchored_lines(s, "test"),
"# Lorem\n# ipsum\n# dolor\n# sit\n# amet"
);
let s = "Lorem\nipsum\nANCHOR: test\ndolor\nsit\namet";
assert_eq!(
take_rustdoc_include_anchored_lines(s, "test"),
"# Lorem\n# ipsum\ndolor\nsit\namet"
);
assert_eq!(
take_rustdoc_include_anchored_lines(s, "something"),
"# Lorem\n# ipsum\n# dolor\n# sit\n# amet"
);
let s = "Lorem\nipsum\nANCHOR: test\ndolor\nsit\namet\nANCHOR_END: test\nlorem\nipsum";
assert_eq!(
take_rustdoc_include_anchored_lines(s, "test"),
"# Lorem\n# ipsum\ndolor\nsit\namet\n# lorem\n# ipsum"
);
assert_eq!(
take_rustdoc_include_anchored_lines(s, "something"),
"# Lorem\n# ipsum\n# dolor\n# sit\n# amet\n# lorem\n# ipsum"
);
let s = "Lorem\nANCHOR: test\nipsum\nANCHOR: test\ndolor\nsit\namet\nANCHOR_END: test\nlorem\nipsum";
assert_eq!(
take_rustdoc_include_anchored_lines(s, "test"),
"# Lorem\nipsum\ndolor\nsit\namet\n# lorem\n# ipsum"
);
assert_eq!(
take_rustdoc_include_anchored_lines(s, "something"),
"# Lorem\n# ipsum\n# dolor\n# sit\n# amet\n# lorem\n# ipsum"
);
let s = "Lorem\nANCHOR: test2\nipsum\nANCHOR: test\ndolor\nsit\namet\nANCHOR_END: test\nlorem\nANCHOR_END:test2\nipsum";
assert_eq!(
take_rustdoc_include_anchored_lines(s, "test2"),
"# Lorem\nipsum\ndolor\nsit\namet\nlorem\n# ipsum"
);
assert_eq!(
take_rustdoc_include_anchored_lines(s, "test"),
"# Lorem\n# ipsum\ndolor\nsit\namet\n# lorem\n# ipsum"
);
assert_eq!(
take_rustdoc_include_anchored_lines(s, "something"),
"# Lorem\n# ipsum\n# dolor\n# sit\n# amet\n# lorem\n# ipsum"
);
let s = "Lorem\nANCHOR: test\nipsum\nANCHOR_END: test\ndolor\nANCHOR: test\nsit\nANCHOR_END: test\namet";
assert_eq!(
take_rustdoc_include_anchored_lines(s, "test"),
"# Lorem\nipsum\n# dolor\nsit\n# amet"
);
}
}

View File

@ -52,7 +52,13 @@ impl DummyBook {
})?;
let sub_pattern = if self.passing_test { "true" } else { "false" };
let files_containing_tests = ["src/first/nested.md", "src/first/nested-test.rs"];
let files_containing_tests = [
"src/first/nested.md",
"src/first/nested-test.rs",
"src/first/nested-test-with-anchors.rs",
"src/first/partially-included-test.rs",
"src/first/partially-included-test-with-anchors.rs",
];
for file in &files_containing_tests {
let path_containing_tests = temp.path().join(file);
replace_pattern_in_file(&path_containing_tests, "$TEST_STATUS", sub_pattern)?;

View File

@ -0,0 +1,11 @@
// The next line will cause a `testing` test to fail if the anchor feature is broken in such a way
// that the whole file gets mistakenly included.
assert!(!$TEST_STATUS);
// ANCHOR: myanchor
// ANCHOR: unendinganchor
// The next line will cause a `rendered_output` test to fail if the anchor feature is broken in
// such a way that the content between anchors isn't included.
// unique-string-for-anchor-test
assert!($TEST_STATUS);
// ANCHOR_END: myanchor

View File

@ -11,3 +11,21 @@ assert!($TEST_STATUS);
```rust
{{#include nested-test.rs}}
```
## Anchors include the part of a file between special comments
```rust
{{#include nested-test-with-anchors.rs:myanchor}}
```
## Rustdoc include adds the rest of the file as hidden
```rust
{{#rustdoc_include partially-included-test.rs:5:7}}
```
## Rustdoc include works with anchors too
```rust
{{#rustdoc_include partially-included-test-with-anchors.rs:rustdoc-include-anchor}}
```

View File

@ -0,0 +1,11 @@
fn some_other_function() {
// ANCHOR: unused-anchor-that-should-be-stripped
assert!($TEST_STATUS);
// ANCHOR_END: unused-anchor-that-should-be-stripped
}
// ANCHOR: rustdoc-include-anchor
fn main() {
some_other_function();
}
// ANCHOR_END: rustdoc-include-anchor

View File

@ -0,0 +1,7 @@
fn some_function() {
assert!($TEST_STATUS);
}
fn main() {
some_function();
}

View File

@ -147,6 +147,32 @@ fn rendered_code_has_playpen_stuff() {
assert_contains_strings(book_js, &[".playpen"]);
}
#[test]
fn anchors_include_text_between_but_not_anchor_comments() {
let temp = DummyBook::new().build().unwrap();
let md = MDBook::load(temp.path()).unwrap();
md.build().unwrap();
let nested = temp.path().join("book/first/nested.html");
let text_between_anchors = vec!["unique-string-for-anchor-test"];
let anchor_text = vec!["ANCHOR"];
assert_contains_strings(nested.clone(), &text_between_anchors);
assert_doesnt_contain_strings(nested, &anchor_text);
}
#[test]
fn rustdoc_include_hides_the_unspecified_part_of_the_file() {
let temp = DummyBook::new().build().unwrap();
let md = MDBook::load(temp.path()).unwrap();
md.build().unwrap();
let nested = temp.path().join("book/first/nested.html");
let text = vec!["# fn some_function() {", "# fn some_other_function() {"];
assert_contains_strings(nested, &text);
}
#[test]
fn chapter_content_appears_in_rendered_document() {
let content = vec![

File diff suppressed because it is too large Load Diff