I had never thought of a use for Brett Terpstra’s Marky the Markdownifier before listening to today’s Systematic. Why would I want to turn a webpage into Markdown?
When I heard that Marky has an API, I was inspired. Pinboard has a “description” field that allows up to 65,000 characters. I never know what to put in this box. Wouldn’t it be great to put the full content of the page in Markdown into this field?
I set out to write a quick Python script to:
- Grab recent Pinboard links.
- Check to see if the URLs still resolve.
- Send the link to Marky and collect a Markdown version of the content.
- Post an updated link to Pinboard with the Markdown in the description field.
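One practical detail for the last step is the 65,000-character limit on the description field. Here is a minimal sketch of trimming the Markdown before posting it. The `fit_description` helper and the trailing marker are assumptions for illustration, not something Marky or Pinboard provides.

```python
# Limit for Pinboard's description field, as stated above.
MAX_DESCRIPTION = 65000


def fit_description(markdown, limit=MAX_DESCRIPTION, marker='\n\n[truncated]'):
    """Return markdown unchanged if it fits, else cut it to `limit` characters.

    Hypothetical helper; the marker text is an arbitrary choice.
    """
    if len(markdown) <= limit:
        return markdown
    # Leave room for the marker so the result is exactly `limit` long.
    return markdown[:limit - len(marker)] + marker


# A 100,000-character input is cut down to the limit.
print(len(fit_description('word ' * 20000)))  # 65000
```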
If all went well, I would release this script on GitHub as Pindown, a great way to put Markdown page content into your Pinboard links.
The script below is far from well-constructed. Had it worked, I would have spent more time cleaning it up: better error handling, and a more complete CLI for finer control over which links receive Markdown content.
Unfortunately, I found that Pinboard consistently returns a 414 error code (URI Too Long) because the request URLs are too long. Why is this a problem? Pinboard, in an attempt to maintain compatibility with the del.icio.us API, uses only GET requests, whereas this kind of request would typically use a POST endpoint. As a result, I cannot send the Markdown along as a data payload; it has to go into the URL itself.
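To see why a GET-only endpoint breaks down here, it helps to look at how long the request URL gets once the Markdown is URL-encoded into the query string. The sketch below uses only the standard library. The `build_add_url` helper, the placeholder token, and the sample text are made up for illustration, and the length at which the server starts answering 414 is not something I know.

```python
from urllib.parse import urlencode

# Endpoint used by the script in this post.
STEM = 'https://api.pinboard.in/v1/posts/add'


def build_add_url(href, extended, auth_token='user:TOKEN'):
    """Return the full GET URL that a posts/add call would request.

    Hypothetical helper for illustration; 'user:TOKEN' is a placeholder.
    """
    query = urlencode({
        'format': 'json',
        'auth_token': auth_token,
        'url': href,
        'extended': extended,
    })
    return STEM + '?' + query


short_url = build_add_url('http://example.com/', 'A short note.')

# Stand-in for a page converted to Markdown (about 20,000 characters).
long_markdown = '## Heading\n\nSome *Markdown* text.\n' * 600
long_url = build_add_url('http://example.com/', long_markdown)

print(len(short_url))
print(len(long_url))
```

The encoded URL comes out longer than the Markdown itself, because characters such as `#`, `*`, and newlines are percent-encoded. With a POST request the same text would travel in the request body instead.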
So I’m sharing this just for folks who are interested in playing with Python, RESTful APIs, and Pinboard. I’m also posting it for my own posterity, since version 2 of the Pinboard API, which drops del.icio.us compatibility, is coming.
import requests
import yaml


def getDataSet(call):
    # Fetch recent bookmarks; `call` is the query string, including the auth token.
    r = requests.get('https://api.pinboard.in/v1/posts/recent' + call)
    return r.json()


def checkURL(url=""):
    # Follow redirects and return the final URL if the page still resolves.
    newurl = requests.get(url)
    if newurl.status_code == 200:
        return newurl.url
    raise ValueError('URL did not resolve', newurl.status_code)


def markyCall(url=""):
    # Ask Marky for a Markdown version of the page.
    r = requests.get('http://heckyesmarkdown.com/go/', params={'u': url})
    return r.text


def process_site(call):
    data_set = getDataSet(call)
    processed_site = []
    errors = []
    for site in data_set['posts']:
        try:
            url = checkURL(site['href'])
        except (ValueError, requests.RequestException):
            # Skip dead links instead of sending them to Marky.
            errors.append(site['href'])
            continue
        site['extended'] = markyCall(url)
        processed_site.append(site)
    print(errors)
    return processed_site


def write_pinboard(site, auth_token):
    stem = 'https://api.pinboard.in/v1/posts/add?format=json&auth_token='
    payload = {
        'url': site.get('href'),
        'description': site.get('description', ''),
        'extended': site.get('extended', ''),
        'tags': site.get('tags', ''),
        'shared': site.get('shared', 'no'),
        'toread': site.get('toread', 'no'),
    }
    # Everything goes into the query string of a GET request, which is
    # why long Markdown content comes back with a 414.
    r = requests.get(stem + auth_token, params=payload)
    print(site['href'] + '\t\t' + str(r.status_code))


def main():
    # AUTH.yaml holds the `user_name` and `token` keys.
    with open('AUTH.yaml') as settings:
        identity = yaml.safe_load(settings)
    auth_token = identity['user_name'] + ':' + identity['token']
    valid_sites = process_site('?format=json&auth_token=' + auth_token)
    for site in valid_sites:
        write_pinboard(site, auth_token)


if __name__ == '__main__':
    main()
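The script expects an `AUTH.yaml` file with `user_name` and `token` keys, which it joins with a colon to form the auth token. Here is a minimal sketch of that step with placeholder values. The `build_auth_token` helper and the sample credentials are made up for illustration.

```python
# Shape of AUTH.yaml as the script reads it (placeholder values):
#
#   user_name: example_user
#   token: "0123456789ABCDEF"
#
# Loading that file with PyYAML would yield a dict like this one.
identity = {'user_name': 'example_user', 'token': '0123456789ABCDEF'}


def build_auth_token(identity):
    """Join user name and token with a colon, the way main() does."""
    return identity['user_name'] + ':' + identity['token']


print(build_auth_token(identity))  # example_user:0123456789ABCDEF
```

Quoting the token in the YAML file keeps an all-digit value from being read as a number, which would break the string concatenation.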