next_result() sends exactly one request to the server.

next_batch() requests results from the server until the data is complete for the latest batch of pages in the result.

retrieve_all() keeps requesting data until all the pages from the query have been returned.

Usage

next_result(x)

next_batch(x)

retrieve_all(x)

Arguments

x

The query. Either a wiki_action_request or a query_tbl.

Value

A query_tbl containing the results of the query. If x is a query_tbl, the function returns a new query_tbl with the new data appended to it. If x is a wiki_action_request, the returned query_tbl contains the data needed to supply future calls to next_result(), next_batch() or retrieve_all().

Details

A query can rarely be fulfilled in a single request to the server. All queries return a list of pages as their result, and that result can be incomplete in two ways. First, not all the data for each page may have been returned; in this case the batch is incomplete. Second, the data may be complete for every page in the batch, but more pages are available on the server; in this case the query can be continued. Hence the three functions: next_result() sends a single request, next_batch() keeps requesting until the latest batch is complete, and retrieve_all() keeps requesting until no pages remain on the server.
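
The difference between an incomplete batch and a continuable query can be sketched with a toy model in base R. This is an illustration only, not the package's implementation: the `responses` table, the `toy_*` functions and the `requests_sent` counter are all hypothetical, and the mock "server" simply replays a fixed sequence of five responses (two requests to finish the first batch, three to finish the second and final one).

```r
# Hypothetical mock server: row i describes the state after request i.
# batch_complete: is the data complete for every page in the latest batch?
# more_pages:     are there further pages on the server beyond this batch?
responses <- data.frame(
  batch_complete = c(FALSE, TRUE, FALSE, FALSE, TRUE),
  more_pages     = c(TRUE,  TRUE, FALSE, FALSE, FALSE)
)

# A toy query is just a count of the requests sent so far.
new_query <- function() list(requests_sent = 0L)

# State reported by the most recent response.
latest <- function(q) responses[q$requests_sent, ]

# Finished when the latest batch is complete and no pages remain.
is_finished <- function(q) {
  q$requests_sent > 0L && latest(q)$batch_complete && !latest(q)$more_pages
}

# Send exactly one request (or none, if the query is already finished).
toy_next_result <- function(q) {
  if (!is_finished(q)) q$requests_sent <- q$requests_sent + 1L
  q
}

# Keep requesting until the latest batch is complete.
toy_next_batch <- function(q) {
  repeat {
    q <- toy_next_result(q)
    if (latest(q)$batch_complete) break
  }
  q
}

# Keep requesting batches until nothing is left on the server.
toy_retrieve_all <- function(q) {
  while (!is_finished(q)) q <- toy_next_batch(q)
  q
}

toy_next_result(new_query())$requests_sent   # 1
toy_next_batch(new_query())$requests_sent    # 2
toy_retrieve_all(new_query())$requests_sent  # 5
```

In this sketch each function is built on the one before it, which mirrors the description above: one request, one complete batch, or the whole query. How the real functions track continuation is not described here, so the counter-based bookkeeping should be read as an assumption made for the sake of the example.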

Examples

# Try out a request using next_result(), then retrieve the rest of the
# results. The cllimit argument limits the first request to 40 results.
preview <- wiki_action_request() %>%
  query_by_title("Steve Wozniak") %>%
  query_page_properties("categories", cllimit = 40) %>%
  gracefully(next_result)
preview
#> <incomplete/query_tbl>
#>  There are more results on the server. Retrieve them with `next_batch()` or `retrieve_all()`
#> ! Data not fully downloaded for last batch. Retrieve it with `next_batch()` or `retrieve_all()`.
#> # A tibble: 1 × 4
#>   pageid    ns title         categories       
#>    <int> <int> <chr>         <list>           
#> 1  27848     0 Steve Wozniak <tibble [40 × 2]>

all_results <- preview %>%
  gracefully(retrieve_all)
all_results
#> <final/query_tbl>
#>  All results downloaded from server
#>  Data complete for all records
#> # A tibble: 1 × 4
#>   pageid    ns title         categories       
#>    <int> <int> <chr>         <list>           
#> 1  27848     0 Steve Wozniak <tibble [80 × 2]>

# tidyr is useful for list-columns.
if (tibble::is_tibble(all_results)) {
  all_results %>%
    tidyr::unnest(cols = c(categories), names_sep = "_")
}
#> # A tibble: 80 × 5
#>    pageid    ns title         categories_ns categories_title                    
#>     <int> <int> <chr>                 <int> <chr>                               
#>  1  27848     0 Steve Wozniak            14 Category:1950 births                
#>  2  27848     0 Steve Wozniak            14 Category:20th-century American busi…
#>  3  27848     0 Steve Wozniak            14 Category:20th-century American engi…
#>  4  27848     0 Steve Wozniak            14 Category:20th-century American inve…
#>  5  27848     0 Steve Wozniak            14 Category:21st-century American busi…
#>  6  27848     0 Steve Wozniak            14 Category:21st-century American engi…
#>  7  27848     0 Steve Wozniak            14 Category:21st-century American inve…
#>  8  27848     0 Steve Wozniak            14 Category:Academic staff of the Univ…
#>  9  27848     0 Steve Wozniak            14 Category:Amateur radio people       
#> 10  27848     0 Steve Wozniak            14 Category:American Freemasons        
#> # ℹ 70 more rows