Building on last month’s post, you can GET arxiv figure data using the following curl command. The query string must, of course, correspond to data in our database:
curl -k 'https://plot2txt-staging.us-east-1.elasticbeanstalk.com/record?type=figure&source=astro&label=temp'
Here we’re requesting figures from arxiv PDFs with ‘astro’ in the source URL, where at least one label contains the substring ‘temp’. Responses are limited to 5 db items per query, but you can iterate through the db by passing the last high-resolution timestamp in a response as the starting key (skey) for the next query, like so:
curl -k 'https://plot2txt-staging.us-east-1.elasticbeanstalk.com/record?type=figure&source=astro&label=temp&skey=1526848905565'
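If you would rather script the iteration than paste timestamps by hand, here is a rough Python sketch of the same loop. The response format is an assumption here: the code supposes each response is a JSON list of objects, and that the high-resolution timestamp sits in a field called "timestamp" (a hypothetical name, so adjust it to whatever the API actually returns). The helper names build_url, fetch_json and iter_figures are also made up for illustration.

```python
import json
import ssl
import urllib.parse
import urllib.request

BASE_URL = "https://plot2txt-staging.us-east-1.elasticbeanstalk.com/record"


def build_url(source, label, skey=None, base_url=BASE_URL):
    """Build the query URL; skey is only added when a starting key is given."""
    params = {"type": "figure", "source": source, "label": label}
    if skey is not None:
        params["skey"] = str(skey)
    return base_url + "?" + urllib.parse.urlencode(params)


def fetch_json(url):
    """Fetch and decode one response, skipping TLS verification like `curl -k`."""
    ctx = ssl.create_default_context()
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    with urllib.request.urlopen(url, context=ctx) as resp:
        return json.load(resp)


def iter_figures(source, label, fetch=fetch_json,
                 timestamp_field="timestamp", max_pages=100):
    """Yield items page by page, feeding the last timestamp back in as skey.

    ASSUMPTIONS: each response is a JSON list of dicts, and the timestamp
    lives under `timestamp_field` (a hypothetical field name).
    """
    skey = None
    for _ in range(max_pages):
        items = fetch(build_url(source, label, skey))
        if not items:
            return  # empty page: nothing left to read
        yield from items
        next_key = items[-1].get(timestamp_field)
        if next_key is None or next_key == skey:
            return  # no usable key, or no progress: stop rather than loop forever
        skey = next_key
```

Usage would look like `for item in iter_figures("astro", "temp"): print(item)`. Whether the starting key is inclusive is not something this sketch knows, so you may see the boundary item twice and want to de-duplicate on the timestamp.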
Besides the source arxiv URL, e.g. “https://arxiv.org/abs/astro-ph/0007095”, you will receive the figure with the detected substring, as well as a fit to the pixels using mixture modeling. Where possible, numerical data from the axes is used to invert the pixel positions back to data coordinates.
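To give a feel for what inverting pixel positions means, here is a minimal Python sketch for the simplest case: a linear axis with two tick marks whose pixel positions and numeric values are known. This is an illustration only, not necessarily how plot2txt does it, and the function name make_axis_inverter is hypothetical. A log axis would need the same interpolation applied to the logarithm of the tick values.

```python
def make_axis_inverter(pixel_a, value_a, pixel_b, value_b):
    """Return a function mapping a pixel position to a data value.

    ASSUMPTION: the axis is linear, so two reference ticks
    (pixel_a -> value_a, pixel_b -> value_b) fully determine the mapping.
    """
    if pixel_a == pixel_b:
        raise ValueError("reference ticks must sit at different pixel positions")

    def pixel_to_value(pixel):
        # Linear interpolation (or extrapolation) between the two ticks.
        return value_a + (pixel - pixel_a) * (value_b - value_a) / (pixel_b - pixel_a)

    return pixel_to_value


# x axis: pixel 100 is labelled 0.0, pixel 500 is labelled 10.0
x_of = make_axis_inverter(100, 0.0, 500, 10.0)

# y axis: image rows grow downward, so the larger value sits at the smaller pixel
y_of = make_axis_inverter(400, 0.0, 0, 100.0)
```

With these two inverters, a fitted point at pixel (300, 200) maps to the data point (x_of(300), y_of(200)).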