Automating OLDaily for LinkedIn
LinkedIn can be a pain to work with. I have long wanted to publish OLDaily, my daily newsletter, on it but the API doesn't support this.
This week I came up with a new plan: use browser automation software running on my desktop at home to download my newsletter XML, format it, and submit it as a new LinkedIn newsletter issue on my LinkedIn account. I turned to ChatGPT for help and after a couple days of iteration and testing, made it work (it would have taken me a long time to do this without that support).
Obviously, I needed both the OLDaily XML file and my newsletter on LinkedIn set up to make this work. Here they both are:
- Newsletter: https://www.linkedin.com/newsletters/oldaily-7369381037719646208/
- OLDaily XML: https://www.downes.ca/news/OLDaily.xml
I set up a single directory on my Windows 11 computer at home for the scripts (I was just going to run it off the server but it gets a bit complicated doing it off an SSH command line). There are three major files:
.env
li_newsletter_selenium.py
run_olddaily.ps1
The first file defines things like the URL and passwords (if you were doing this yourself you'd set your own values here):
LINKEDIN_EMAIL=s***a
LINKEDIN_PASSWORD=***
NEWSLETTER_NAME=OLDaily # must match exactly in LinkedIn’s modal
FEED_XML_URL=https://www.downes.ca/news/OLDaily.xml
MAX_ISSUES_PER_RUN=1 # 1 is safest; raise if you want
HEADLESS=false # set true for silent runs
TITLE_PREFIX="OLDaily - " # quoted so the trailing space is kept
TITLE_DATE_FORMAT=%Y-%m-%d
TIMEZONE=America/Toronto
# optional:
# EDITION_DATE=2025-09-05
PROFILE_DIR=E:\Websites\downes\chrome_profile
COMPOSER_URL=https://www.linkedin.com/article/new/?author=urn%3Ali%3Afsd_profile%3AACoAAAAI52YBB6qnG3mdwHncS6-Lx5nnkx5Rz8I
Get the 'composer URL' from LinkedIn by creating a newsletter, then creating an article for that newsletter.
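The composer URL looks like the article-composer address with a percent-encoded author URN in its `author` query parameter (this is the form used for the COMPOSER_URL default in the script). As a minimal sketch, assuming LinkedIn keeps that URL shape, the encoding can be reproduced with the standard library. The helper name `composer_url` is hypothetical and not part of the script:

```python
from urllib.parse import quote

# Base address taken from the COMPOSER_URL default used in the script.
COMPOSER_BASE = "https://www.linkedin.com/article/new/"


def composer_url(author_urn: str) -> str:
    """Build a composer URL from an author URN (hypothetical helper)."""
    # safe="" forces ':' to be percent-encoded as %3A.
    return f"{COMPOSER_BASE}?author={quote(author_urn, safe='')}"


if __name__ == "__main__":
    print(composer_url("urn:li:fsd_profile:ACoAAAAI52YBB6qnG3mdwHncS6-Lx5nnkx5Rz8I"))
```

Copying the address from the browser is still the safer route, since it is whatever LinkedIn actually generated for your account.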
I had to import a number of libraries for the Python script, including especially Selenium. So I updated my Python installation and created the project directory (E:\Websites\downes), then created a Python virtual environment (I've always hated Python environments, which is why I spent so many years as a Perl coder):
(In PowerShell)
python -m venv .\venv
.\venv\Scripts\Activate.ps1
Then I installed my dependencies (in PowerShell):
python -m pip install -U pip setuptools wheel
pip install selenium feedparser beautifulsoup4 python-dotenv tzdata requests
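Before running the main script it can help to confirm that the virtual environment really has everything. The following is a small sketch, not part of the original setup, that checks each package by its import name using only the standard library. The function name `missing_modules` is made up for illustration:

```python
from importlib.util import find_spec

# Import names for the packages installed above. The import name can differ
# from the pip name (beautifulsoup4 -> bs4, python-dotenv -> dotenv).
REQUIRED_MODULES = ["selenium", "feedparser", "bs4", "dotenv", "tzdata", "requests"]


def missing_modules(names):
    """Return the module names that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    print("Missing:", ", ".join(missing) if missing else "none")
```

Run it with the venv activated; anything listed as missing needs another `pip install`.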
Then I put my Python script into the project directory (this took a *lot* of iteration to get right):
# li_newsletter_selenium.py
# Create ONE LinkedIn newsletter issue from a single XML page.
# Title: "OLDaily - <today's date>"
# Body: each <item> becomes a paragraph: <p><a href="LINK">TITLE</a> DESCRIPTION</p>
import os, json, time, sys, html
from pathlib import Path
from dataclasses import dataclass
from datetime import datetime
from zoneinfo import ZoneInfo, ZoneInfoNotFoundError
import feedparser
from bs4 import BeautifulSoup
from dotenv import load_dotenv
from urllib.parse import urljoin
# --- Selenium (standard) ---
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
# ===================== CONFIG & ENV =====================
load_dotenv()
# LinkedIn auth / target
LINKEDIN_EMAIL = os.getenv("LINKEDIN_EMAIL") # you@example.com (optional; you can log in manually)
LINKEDIN_PASSWORD = os.getenv("LINKEDIN_PASSWORD") # (optional)
NEWSLETTER_NAME = os.getenv("NEWSLETTER_NAME") # must match exactly in LinkedIn UI
# Source XML page (the whole page is one edition)
FEED_XML_URL = os.getenv("FEED_XML_URL") # e.g. https://example.com/olddaily.xml
# Title settings
TITLE_PREFIX = os.getenv("TITLE_PREFIX", "OLDaily - ")
TITLE_DATE_FORMAT = os.getenv("TITLE_DATE_FORMAT", "%Y-%m-%d") # change to "%B %d, %Y" if you prefer
TIMEZONE = os.getenv("TIMEZONE", "America/Toronto")
# Optional: override edition date (YYYY-MM-DD); otherwise "today" in TIMEZONE
EDITION_DATE = os.getenv("EDITION_DATE", "").strip()
# Chrome profile & headless
HEADLESS = os.getenv("HEADLESS", "false").lower() == "true"
PROFILE_DIR = str(Path(os.getenv("PROFILE_DIR", r"E:\Websites\downes\chrome_profile")).resolve())
# LinkedIn Article composer URL (your author URN embedded)
COMPOSER_URL = os.getenv(
    "COMPOSER_URL",
    "https://www.linkedin.com/article/new/?author=urn%3Ali%3Afsd_profile%3AACoAAAAI52YBB6qnG3mdwHncS6-Lx5nnkx5Rz8I"
)
POSTED_PATH = Path("posted.json")
# Sanity checks
assert NEWSLETTER_NAME, "Set NEWSLETTER_NAME in .env"
assert FEED_XML_URL, "Set FEED_XML_URL in .env"
# ===================== HELPERS =====================
def sanitize_keep_links(html_in: str, base_url: str) -> str:
    """
    Keep only <a> and <br>. For <a>, keep a safe absolute href.
    Strip all other tags (but keep their text).
    """
    soup = BeautifulSoup(html_in or "", "html.parser")
    for tag in soup.find_all(True):
        if tag.name == "a":
            href = tag.get("href") or ""
            if href:
                href = urljoin(base_url or "", href)
                if href.startswith(("http://", "https://", "mailto:", "tel:")):
                    tag.attrs = {"href": href}
                else:
                    tag.unwrap()
            else:
                tag.unwrap()
        elif tag.name == "br":
            # keep line breaks
            continue
        else:
            tag.unwrap()
    return (str(soup) or "").strip()

def _find_headline_element(drv):
    """Return the best guess for the headline element (WebElement) or None."""
    # Inputs / textareas that look like title/headline
    input_like = [
        "//input[( @placeholder or @aria-label ) and (contains(translate(@placeholder,'ABCDEFGHIJKLMNOPQRSTUVWXYZ','abcdefghijklmnopqrstuvwxyz'),'headline') or contains(translate(@placeholder,'ABCDEFGHIJKLMNOPQRSTUVWXYZ','abcdefghijklmnopqrstuvwxyz'),'title') or contains(translate(@aria-label,'ABCDEFGHIJKLMNOPQRSTUVWXYZ','abcdefghijklmnopqrstuvwxyz'),'headline') or contains(translate(@aria-label,'ABCDEFGHIJKLMNOPQRSTUVWXYZ','abcdefghijklmnopqrstuvwxyz'),'title'))]",
        "//textarea[( @placeholder or @aria-label ) and (contains(translate(@placeholder,'ABCDEFGHIJKLMNOPQRSTUVWXYZ','abcdefghijklmnopqrstuvwxyz'),'headline') or contains(translate(@placeholder,'ABCDEFGHIJKLMNOPQRSTUVWXYZ','abcdefghijklmnopqrstuvwxyz'),'title') or contains(translate(@aria-label,'ABCDEFGHIJKLMNOPQRSTUVWXYZ','abcdefghijklmnopqrstuvwxyz'),'headline') or contains(translate(@aria-label,'ABCDEFGHIJKLMNOPQRSTUVWXYZ','abcdefghijklmnopqrstuvwxyz'),'title'))]",
    ]
    # Contenteditable candidates
    editable_like = [
        "//div[@contenteditable='true' and (@data-placeholder='Add a headline' or @data-placeholder='Add headline' or contains(translate(@data-placeholder,'ABCDEFGHIJKLMNOPQRSTUVWXYZ','abcdefghijklmnopqrstuvwxyz'),'title'))]",
        "//h1[@contenteditable='true']",
        "//div[@role='textbox' and @contenteditable='true' and (contains(translate(@aria-label,'ABCDEFGHIJKLMNOPQRSTUVWXYZ','abcdefghijklmnopqrstuvwxyz'),'headline') or contains(translate(@aria-label,'ABCDEFGHIJKLMNOPQRSTUVWXYZ','abcdefghijklmnopqrstuvwxyz'),'title'))]",
        "//header//*[(@contenteditable='true') or self::h1[@contenteditable='true']]",
        "(//div[@contenteditable='true'])[1]"
    ]
    for xp in input_like + editable_like:
        try:
            el = WebDriverWait(drv, 10).until(EC.element_to_be_clickable((By.XPATH, xp)))
            return el
        except:
            pass
    return None

def _get_textlike_value(drv, el):
    """Read current value/text from input/textarea/contenteditable via JS."""
    return drv.execute_script("""
        const el = arguments[0];
        const tn = el.tagName.toLowerCase();
        if (tn === 'input' || tn === 'textarea') return el.value || '';
        if (el.getAttribute('contenteditable') === 'true') return el.innerText || el.textContent || '';
        return '';
    """, el) or ""

def _set_via_exec_command(drv, el, text):
    """Use execCommand insertText which triggers input events in most editors."""
    return drv.execute_script("""
        const el = arguments[0];
        const text = arguments[1];
        el.focus();
        try { document.execCommand('selectAll', false, null); document.execCommand('delete', false, null); } catch(e){}
        const ok = document.execCommand('insertText', false, text);
        el.dispatchEvent(new InputEvent('input', {bubbles:true}));
        el.dispatchEvent(new Event('change', {bubbles:true}));
        return ok;
    """, el, text)

def _set_value_and_events(drv, el, text):
    """Directly set value/textContent and fire events."""
    return drv.execute_script("""
        const el = arguments[0];
        const text = arguments[1];
        const tn = el.tagName.toLowerCase();
        el.focus();
        if (tn === 'input' || tn === 'textarea') {
            el.value = text;
        } else if (el.getAttribute('contenteditable') === 'true') {
            el.textContent = text;
        } else {
            return false;
        }
        el.dispatchEvent(new InputEvent('input', {bubbles:true}));
        el.dispatchEvent(new Event('change', {bubbles:true}));
        el.blur();
        return true;
    """, el, text)

def find_clickable(drv, xps, timeout_each=10):
    """Try a list of XPaths; return the first clickable WebElement or None."""
    for xp in xps:
        try:
            el = WebDriverWait(drv, timeout_each).until(EC.element_to_be_clickable((By.XPATH, xp)))
            return el
        except:
            pass
    return None

def ensure_modal(drv, timeout=60):
    """Wait for a modal/dialog to be present."""
    try:
        WebDriverWait(drv, timeout).until(EC.presence_of_element_located(
            (By.XPATH, "//div[contains(@role,'dialog') or contains(@class,'artdeco-modal')]")
        ))
        return True
    except:
        return False

def load_posted():
    if POSTED_PATH.exists():
        try:
            return set(json.loads(POSTED_PATH.read_text()))
        except Exception:
            return set()
    return set()

def save_posted(s):
    POSTED_PATH.write_text(json.dumps(sorted(list(s)), indent=2))

def debug_dump(drv, stem="debug"):
    try:
        png = f"{stem}.png"
        html_path = f"{stem}.html"
        drv.save_screenshot(png)
        Path(html_path).write_text(drv.page_source, encoding="utf-8", errors="ignore")
        print(f"[debug] Saved {png} and {html_path}")
    except Exception as e:
        print(f"[debug] Could not save debug artifacts: {e}")

@dataclass
class NewsItem:
    link: str
    title: str
    description_html: str  # sanitized HTML with <a> preserved

def fetch_news_items_from_xml(url: str) -> list[NewsItem]:
    """Parse the XML page; each <item> becomes a NewsItem (preserving <a> links in description)."""
    d = feedparser.parse(url)
    items: list[NewsItem] = []
    for e in d.entries:
        base_url = (e.get("link") or url or "").strip()
        link = (e.get("link") or "").strip()
        title = (e.get("title") or "").strip()
        # Prefer full content, else description/summary
        desc_html_raw = ""
        if getattr(e, "content", None):
            try:
                desc_html_raw = e.content[0].value or ""
            except Exception:
                desc_html_raw = ""
        if not desc_html_raw:
            desc_html_raw = e.get("description") or e.get("summary") or ""
        desc_html = sanitize_keep_links(desc_html_raw, base_url)
        if link or title or desc_html:
            items.append(NewsItem(link=link, title=title, description_html=desc_html))
    return items

def make_title() -> str:
    tz = None
    try:
        tz = ZoneInfo(TIMEZONE)
    except ZoneInfoNotFoundError:
        try:
            import tzdata  # ensure package is present
            tz = ZoneInfo(TIMEZONE)
        except Exception:
            tz = datetime.now().astimezone().tzinfo  # fallback to local tz
    if EDITION_DATE:
        try:
            dt = datetime.strptime(EDITION_DATE, "%Y-%m-%d")
            if dt.tzinfo is None:
                dt = dt.replace(tzinfo=tz)
        except ValueError:
            dt = datetime.now(tz)
    else:
        dt = datetime.now(tz)
    return f"{TITLE_PREFIX}{dt.strftime(TITLE_DATE_FORMAT)}"

def build_issue_html(items: list[NewsItem]) -> str:
    """
    Render each item as:
    <p><a href="LINK">TITLE</a> DESCRIPTION_HTML</p>
    DESCRIPTION_HTML is sanitized but preserves <a> and <br>.
    """
    parts = []
    for it in items:
        link_attr = html.escape(it.link or "", quote=True)
        title_text = html.escape(it.title or "")
        desc_html = it.description_html or ""
        if link_attr and title_text:
            parts.append(f'<p><a href="{link_attr}">{title_text}</a> {desc_html}</p>')
        elif title_text:
            parts.append(f'<p><strong>{title_text}</strong> {desc_html}</p>')
        elif desc_html:
            parts.append(f"<p>{desc_html}</p>")
    return "\n".join(parts)

# ===================== SELENIUM SETUP =====================
def make_driver():
    Path(PROFILE_DIR).mkdir(parents=True, exist_ok=True)
    options = Options()
    options.add_argument(f"--user-data-dir={PROFILE_DIR}")
    options.add_argument("--profile-directory=Default")
    options.add_argument("--disable-notifications")
    options.add_argument("--start-maximized")
    options.add_argument("--disable-blink-features=AutomationControlled")
    options.add_experimental_option("excludeSwitches", ["enable-automation"])
    options.add_experimental_option("useAutomationExtension", False)
    if HEADLESS:
        options.add_argument("--headless=new")
    drv = webdriver.Chrome(options=options)
    drv.set_page_load_timeout(120)
    drv.implicitly_wait(2)
    return drv

def wait(drv, timeout=45):
    return WebDriverWait(drv, timeout)

def logged_in(drv):
    try:
        wait(drv, 8).until(EC.presence_of_element_located((By.CSS_SELECTOR, "input[placeholder*='Search']")))
        return True
    except:
        return False

def ensure_login(drv):
    drv.get("https://www.linkedin.com/login")
    time.sleep(1.5)
    if logged_in(drv):
        print("[login] Already logged in.")
        return
    if not LINKEDIN_EMAIL or not LINKEDIN_PASSWORD:
        print("[login] Waiting for manual login/2FA…")
        for _ in range(150):
            if logged_in(drv):
                print("[login] Detected logged-in state.")
                return
            time.sleep(1)
        print("[login] Proceeding without explicit login check.")
        return
    try:
        email = wait(drv).until(EC.presence_of_element_located((By.ID, "username")))
        pwd = drv.find_element(By.ID, "password")
        email.clear(); email.send_keys(LINKEDIN_EMAIL)
        pwd.clear(); pwd.send_keys(LINKEDIN_PASSWORD); pwd.send_keys(Keys.ENTER)
        for _ in range(120):
            if logged_in(drv):
                print("[login] Success.")
                return
            time.sleep(1)
    except Exception as e:
        print(f"[login] warning: {e}")

# ---------- Composer opening (robust) ----------
def ready_state_complete(drv, timeout=60):
    t0 = time.time()
    while time.time() - t0 < timeout:
        try:
            if drv.execute_script("return document.readyState") == "complete":
                return True
        except:
            pass
        time.sleep(0.25)
    return False

def click_if_visible(drv, xpaths, pause=0.25):
    for xp in xpaths:
        try:
            el = WebDriverWait(drv, 3).until(EC.element_to_be_clickable((By.XPATH, xp)))
            drv.execute_script("arguments[0].click();", el)
            time.sleep(pause)
        except:
            pass

def editor_ready(drv):
    selectors = [
        "//div[@contenteditable='true' and (@data-placeholder='Add a headline' or @data-placeholder='Add headline')]",
        "//h1[@contenteditable='true']",
        "//div[@role='textbox' and contains(@aria-label,'headline')]",
        "//div[@contenteditable='true' and contains(@aria-label,'headline')]",
        "//div[@contenteditable='true' and contains(@data-placeholder,'Start writing')]",
        "//div[@role='textbox' and @contenteditable='true' and not(ancestor::header)]",
        "//div[@contenteditable='true' and not(ancestor::header)]",
        "//*[@data-test-id[contains(.,'editor')]]//div[@contenteditable='true']",
    ]
    for xp in selectors:
        if drv.find_elements(By.XPATH, xp):
            return True
    return False

def try_composer_url(drv):
    drv.get(COMPOSER_URL)
    ready_state_complete(drv, 60)
    # Dismiss banners/coachmarks
    click_if_visible(drv, [
        "//button[.//span[contains(.,'Accept') or contains(.,'Agree')]]",
        "//button[.//span[contains(.,'Got it') or contains(.,'OK') or contains(.,'Ok')]]",
        "//button[.//span[contains(.,'Skip') or contains(.,'Not now')]]",
        "//button[normalize-space()='Accept']",
        "//button[normalize-space()='Agree']",
        "//button[normalize-space()='Got it']",
        "//button[normalize-space()='Skip']",
        "//button[normalize-space()='Not now']",
    ])
    t0 = time.time()
    while time.time() - t0 < 75:
        if editor_ready(drv):
            return True
        time.sleep(0.5)
    return False

def try_feed_then_click_write_article(drv):
    drv.get("https://www.linkedin.com/feed/")
    ready_state_complete(drv, 60)
    click_if_visible(drv, [
        "//button[.//span[contains(.,'Accept') or contains(.,'Agree')]]",
        "//button[.//span[contains(.,'Got it') or contains(.,'OK') or contains(.,'Ok')]]",
        "//button[.//span[contains(.,'Skip') or contains(.,'Not now')]]",
    ])
    candidates = [
        "//a[contains(@href,'/article/new')]",
        "//a[.//span[contains(.,'Write article')]]",
        "//button[.//span[contains(.,'Write article')]]",
        "//div[contains(@data-test-id,'share-box')]//a[contains(@href,'/article/new')]",
    ]
    for xp in candidates:
        try:
            el = wait(drv, 20).until(EC.element_to_be_clickable((By.XPATH, xp)))
            drv.execute_script("arguments[0].scrollIntoView({block:'center'});", el)
            drv.execute_script("arguments[0].click();", el)
            break
        except:
            pass
    if len(drv.window_handles) > 1:
        drv.switch_to.window(drv.window_handles[-1])
    ready_state_complete(drv, 60)
    t0 = time.time()
    while time.time() - t0 < 75:
        if editor_ready(drv):
            return True
        time.sleep(0.5)
    return False

def open_composer(drv):
    if try_composer_url(drv):
        return
    if try_feed_then_click_write_article(drv):
        return
    debug_dump(drv, "debug_composer")
    raise TimeoutError("LinkedIn editor did not appear. See debug_composer.*")

# ---------- Editor actions ----------
def set_headline(drv, headline):
    el = _find_headline_element(drv)
    if not el:
        debug_dump(drv, "debug_headline_not_found")
        raise RuntimeError("Could not locate the headline field.")
    # Scroll into view & focus
    try:
        drv.execute_script("arguments[0].scrollIntoView({block:'center'});", el)
        el.click()
        time.sleep(0.2)
    except:
        pass

    def ok_now():
        val = (_get_textlike_value(drv, el) or "").strip()
        want = (headline or "").strip()
        # Normalize internal whitespace for a fair compare
        val = " ".join(val.split())
        want = " ".join(want.split())
        return val == want

    # Strategy 1: plain key events
    try:
        el.send_keys(Keys.CONTROL, "a"); el.send_keys(Keys.DELETE)
        time.sleep(0.1)
        # type in chunks to better trigger frameworks like Draft.js/Slate
        for chunk in [headline[i:i+20] for i in range(0, len(headline), 20)]:
            el.send_keys(chunk)
            time.sleep(0.02)
        if ok_now():
            return
    except:
        pass
    # Strategy 2: execCommand('insertText') (fires beforeinput/input)
    try:
        _set_via_exec_command(drv, el, headline)
        time.sleep(0.15)
        if ok_now():
            return
    except:
        pass
    # Strategy 3: set value/textContent and dispatch input/change
    try:
        _set_value_and_events(drv, el, headline)
        time.sleep(0.15)
        if ok_now():
            return
    except:
        pass
    # Final attempt: re-focus, type again with keys
    try:
        el.click()
        time.sleep(0.1)
        el.send_keys(Keys.CONTROL, "a"); el.send_keys(Keys.DELETE)
        el.send_keys(headline)
        time.sleep(0.15)
        if ok_now():
            return
    except:
        pass
    debug_dump(drv, "debug_headline_sticky")
    raise RuntimeError("Headline could not be set (framework ignored changes).")

def set_body(drv, html_body):
    candidates = [
        "//div[@contenteditable='true' and contains(@data-placeholder,'Start writing')]",
        "//div[@role='textbox' and @contenteditable='true' and not(ancestor::header)]",
        "//div[@contenteditable='true' and not(ancestor::header)]",
        "(//div[@contenteditable='true'])[last()]",
    ]
    last_err = None
    for xp in candidates:
        try:
            body = wait(drv, 20).until(EC.element_to_be_clickable((By.XPATH, xp)))
            drv.execute_script("arguments[0].scrollIntoView({block:'center'});", body)
            body.click()
            drv.execute_script("""
                const el = arguments[0];
                const html = arguments[1];
                el.focus();
                try { document.execCommand('selectAll', false, null); document.execCommand('delete', false, null); } catch(e){}
                const sel = window.getSelection();
                if (!sel.rangeCount) {
                    const r = document.createRange();
                    r.selectNodeContents(el);
                    r.collapse(false);
                    sel.removeAllRanges();
                    sel.addRange(r);
                }
                const range = sel.getRangeAt(0);
                const tmp = document.createElement('div');
                tmp.innerHTML = html;
                const frag = document.createDocumentFragment();
                while (tmp.firstChild) frag.appendChild(tmp.firstChild);
                range.deleteContents();
                range.insertNode(frag);
            """, body, html_body)
            time.sleep(0.8)
            return
        except Exception as e:
            last_err = e
    debug_dump(drv, "debug_body")
    raise RuntimeError(f"Could not set body content: {last_err}")

def click_next(drv):
    """Click the pre-publish 'Next' step if LinkedIn shows it. If it's not there, do nothing."""
    next_selectors = [
        "//button[.//span[normalize-space()='Next']]",
        "//button[normalize-space()='Next']",
        "//button[contains(@aria-label,'Next')]",
        "//button[contains(., 'Next')]",  # fallback (broader)
        "//div[@role='dialog']//button[.//span[normalize-space()='Next']]",  # if inside dialog
    ]
    btn = find_clickable(drv, next_selectors, timeout_each=5)
    if btn:
        try:
            drv.execute_script("arguments[0].scrollIntoView({block:'center'});", btn)
            time.sleep(0.2)
            drv.execute_script("arguments[0].click();", btn)
            print("[next] Clicked.")
            # Give the UI a beat to transition
            time.sleep(0.8)
        except Exception as e:
            print(f"[next] warning: {e}")
    else:
        print("[next] No 'Next' button visible; continuing.")

def select_newsletter_and_publish(drv, subtitle_text):
    """
    Handles BOTH flows:
    A) If there's a 'Publish' button on the page, click it to open the modal.
    B) If we're already in a modal (after 'Next'), just proceed.
    Then: ensure 'Newsletter' destination, choose the target newsletter, fill subtitle, and click Publish.
    """
    def click_publish_button_on_page():
        publish_selectors = [
            "//button[.//span[normalize-space()='Publish']]",
            "//button[normalize-space()='Publish']",
            "//*[@data-test-id[contains(.,'publish')]]",
            # Some variants put Publish in a sticky header bar
            "//header//button[.//span[normalize-space()='Publish']]",
        ]
        btn = find_clickable(drv, publish_selectors, timeout_each=5)
        if btn:
            drv.execute_script("arguments[0].scrollIntoView({block:'center'});", btn)
            time.sleep(0.2)
            drv.execute_script("arguments[0].click();", btn)
            print("[publish] Primary clicked (to open modal).")
            return True
        return False

    # 1) If no modal yet, try to open it via a Publish button on the page
    modal_present = ensure_modal(drv, timeout=4)
    if not modal_present:
        opened = click_publish_button_on_page()
        if opened:
            modal_present = ensure_modal(drv, timeout=20)
    if not modal_present:
        # Last-chance: sometimes 'Next' immediately shows modal content; brief wait:
        modal_present = ensure_modal(drv, timeout=10)
    if not modal_present:
        debug_dump(drv, "debug_publish_open")
        raise RuntimeError("Could not open the publish dialog (modal not found).")

    # 2) Inside the modal, prefer 'Newsletter' destination if shown
    try:
        # Radio / tab labeled "Newsletter"
        click_if_visible(drv, [
            "//label[.//span[contains(.,'Newsletter')]]/preceding-sibling::input[@type='radio']",
            "//button[.//span[contains(.,'Newsletter')]]",
            "//*[contains(@role,'tab') and .//span[contains(.,'Newsletter')]]",
        ], pause=0.2)
    except:
        pass

    # 3) Choose your newsletter by name (works whether it's a list or dropdown)
    picked = False
    # (a) Directly clickable label/list item
    try:
        cand = drv.find_elements(By.XPATH,
            f"//span[normalize-space()='{NEWSLETTER_NAME}']/ancestor::*[(self::label or self::button or self::div or self::li)][1]"
        )
        if cand:
            drv.execute_script("arguments[0].click();", cand[0])
            picked = True
    except:
        pass
    # (b) Open dropdown/combobox and pick from menu
    if not picked:
        try:
            # Try to open any newsletter picker
            click_if_visible(drv, [
                "//*[@role='combobox']",
                "//button[contains(@id,'newsletter') and contains(@aria-expanded,'false')]",
                "//button[.//span[contains(.,'Select') and contains(.,'newsletter')]]",
            ], pause=0.3)
            time.sleep(0.4)
            # Click the item by name in the menu/listbox
            opt = find_clickable(drv, [
                f"//div[@role='listbox']//div[normalize-space()='{NEWSLETTER_NAME}']",
                f"//ul[contains(@role,'listbox')]//li[.//span[normalize-space()='{NEWSLETTER_NAME}']]",
                f"//*[self::div or self::span or self::li][normalize-space()='{NEWSLETTER_NAME}']",
            ], timeout_each=5)
            if opt:
                drv.execute_script("arguments[0].click();", opt)
                picked = True
        except:
            pass
    if not picked:
        print("[publish] Newsletter picker not visible or already selected; continuing.")

    # 4) Subtitle/description (optional field in modal)
    try:
        sub = drv.find_element(By.XPATH, "//textarea | //div[@role='textbox' and @contenteditable='true']")
        drv.execute_script("arguments[0].scrollIntoView({block:'center'});", sub)
        sub.click()
        for _ in range(3):
            sub.send_keys(Keys.CONTROL, "a"); sub.send_keys(Keys.DELETE)
        sub.send_keys(subtitle_text[:250])
    except:
        pass

    # 5) Final 'Publish' inside the modal
    confirm_selectors = [
        "//div[contains(@role,'dialog') or contains(@class,'artdeco-modal')]//button[.//span[normalize-space()='Publish']]",
        "//div[contains(@role,'dialog') or contains(@class,'artdeco-modal')]//button[normalize-space()='Publish']",
        "//div[contains(@role,'dialog') or contains(@class,'artdeco-modal')]//button[.//span[contains(.,'Publish now')]]",
        "//div[contains(@role,'dialog') or contains(@class,'artdeco-modal')]//button[.//span[normalize-space()='Post']]",  # rare variant
        "//button[@data-test-id='confirmPublish']",
    ]
    btn = find_clickable(drv, confirm_selectors, timeout_each=15)
    if not btn:
        debug_dump(drv, "debug_publish_confirm")
        raise RuntimeError("Final Publish confirm not found.")
    drv.execute_script("arguments[0].scrollIntoView({block:'center'});", btn)
    time.sleep(0.2)
    drv.execute_script("arguments[0].click();", btn)
    print("[publish] Confirmed.")

# ===================== MAIN FLOW =====================
def main():
    # Build content from XML
    print("[build] Fetching XML…")
    items = fetch_news_items_from_xml(FEED_XML_URL)
    if not items:
        print("[build] No <item> elements found in XML. Aborting.")
        sys.exit(1)
    title = make_title()
    print(f"[build] Title: {title}")
    body_html = build_issue_html(items)
    # Duplicate guard by edition title (date-based)
    posted = load_posted()
    unique_key = f"issue:{title}"
    if unique_key in posted:
        print(f"[guard] Already posted today: {title}")
        return
    # Start browser
    drv = make_driver()
    try:
        ensure_login(drv)
        # Open editor
        open_composer(drv)
        print(f"[editor] Ready. Setting headline/body…")
        set_headline(drv, title)
        set_body(drv, body_html)
        click_next(drv)  # clicks 'Next' if present (otherwise harmless)
        # If no modal within ~2s, re-check the headline and try Next again once.
        if not ensure_modal(drv, timeout=2):
            try:
                el = _find_headline_element(drv)
                if el and _get_textlike_value(drv, el).strip():
                    click_next(drv)
            except:
                pass
        subtitle = f"Summary for {title}"
        select_newsletter_and_publish(drv, subtitle)  # opens modal (if needed) and clicks Publish
        try:
            WebDriverWait(drv, 45).until(EC.presence_of_element_located(
                (By.XPATH, "//div[contains(.,'Published') or contains(.,'published')] | //a[contains(.,'View') and contains(.,'post')]")
            ))
        except:
            pass
        print("[done] Issue published.")
        posted.add(unique_key)
        save_posted(posted)
    except Exception as e:
        print(f"[fatal] {e}")
        debug_dump(drv, "debug_fatal")
        raise
    finally:
        if HEADLESS:
            drv.quit()

if __name__ == "__main__":
    main()
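The body of each issue is assembled one paragraph per feed item. To make that step easier to see on its own, here is a stripped-down sketch of the same rendering rule using only the standard library. The function name `render_item` and the sample item are made up for illustration; the script itself does this inside `build_issue_html`:

```python
import html


def render_item(link: str, title: str, description_html: str) -> str:
    """Render one feed item as a paragraph, mirroring build_issue_html."""
    # The link and title arrive as plain text, so they are escaped here.
    # The description is assumed to be sanitized HTML already and is left as-is.
    link_attr = html.escape(link, quote=True)
    title_text = html.escape(title)
    if link_attr and title_text:
        return f'<p><a href="{link_attr}">{title_text}</a> {description_html}</p>'
    if title_text:
        return f"<p><strong>{title_text}</strong> {description_html}</p>"
    return f"<p>{description_html}</p>" if description_html else ""


if __name__ == "__main__":
    # Made-up item for illustration only.
    print(render_item("https://example.com/a?x=1&y=2", "Tools & Toys", "A short note."))
    # prints: <p><a href="https://example.com/a?x=1&amp;y=2">Tools &amp; Toys</a> A short note.</p>
```

Escaping the title and link matters because a stray `&` or `"` in the feed would otherwise produce broken markup in the editor.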
(Here's a downloadable link to the script so you don't have to mess with copy and paste):
I also created a directory for my Chrome profile: E:\Websites\downes\chrome_profile
Then I ran the script manually for the first time, in order to log in and create the profile (this is useful if there's a CAPTCHA or 2FA or something). From the project directory in PowerShell:
python .\li_newsletter_selenium.py
This will open Chrome and allow you to log in if you need to (I didn't need to; it just used my .env values and went straight in).
If there are errors the script will output screenshots and error reports in the project directory. Here's one:
Heh.
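The debug files use fixed names (debug_composer, debug_headline_not_found, debug_headline_sticky, debug_body, debug_publish_open, debug_publish_confirm and debug_fatal, each as a .png and an .html). A small sketch for listing whatever is there, newest first, might look like this. The helper `list_debug_artifacts` is hypothetical and not part of the script:

```python
from pathlib import Path


def list_debug_artifacts(project_dir="."):
    """Return debug*.png / debug*.html files in project_dir, newest first."""
    root = Path(project_dir)
    files = [p for p in root.glob("debug*") if p.suffix in {".png", ".html"}]
    return sorted(files, key=lambda p: p.stat().st_mtime, reverse=True)


if __name__ == "__main__":
    for path in list_debug_artifacts():
        print(path.name)
```

The screenshot usually says more than the HTML, but the HTML is what you need when an XPath selector has stopped matching.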
To automate the newsletter I used the built-in Task Scheduler for Windows (on Linux I would just use cron, but there's nothing so simple in Windows).
Here's the script (run_olddaily.ps1) to run:
Set-Location E:\Websites\downes
& .\venv\Scripts\Activate.ps1
$env:HEADLESS="true" # run Chrome headless
$env:TITLE_DATE_FORMAT="%Y-%m-%d" # or "%B %d, %Y" for "September 05, 2025"
python .\li_newsletter_selenium.py *>> .\run.log
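To see what a given TITLE_DATE_FORMAT produces before scheduling anything, the title logic can be tried in isolation. This is a sketch of what `make_title` does, rewritten as a standalone function (the name `edition_title` is mine). Unlike the script, it raises on a malformed date instead of falling back to today:

```python
from datetime import datetime
from zoneinfo import ZoneInfo, ZoneInfoNotFoundError


def edition_title(prefix, date_format, tz_name, edition_date=""):
    """Build the issue title from a prefix, a strftime format and a time zone."""
    try:
        tz = ZoneInfo(tz_name)
    except ZoneInfoNotFoundError:
        # No time zone database available: fall back to the machine's local zone.
        tz = datetime.now().astimezone().tzinfo
    if edition_date:
        dt = datetime.strptime(edition_date, "%Y-%m-%d").replace(tzinfo=tz)
    else:
        dt = datetime.now(tz)
    return f"{prefix}{dt.strftime(date_format)}"


if __name__ == "__main__":
    print(edition_title("OLDaily - ", "%Y-%m-%d", "America/Toronto", "2025-09-05"))
    print(edition_title("OLDaily - ", "%B %d, %Y", "America/Toronto", "2025-09-05"))
```

Note that `%d` is zero-padded, so the second format gives "September 05, 2025" rather than "September 5, 2025".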
Then set up the task scheduler as follows:
Action: Program/script:
powershell.exe
Arguments: -ExecutionPolicy Bypass -File "E:\Websites\downes\run_olddaily.ps1"
Start in: E:\Websites\downes
Trigger: Daily at your preferred time (since I only publish on weekdays, I selected 'Weekly' and then picked the specific days)
Options: Run whether user is logged on or not; configure for Windows 10/11.
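If you would rather leave the trigger on Daily, the weekday restriction could instead live in the Python script. This is only a sketch of that alternative, not something the script above does; `is_publishing_day` is a hypothetical helper you would call at the top of `main()` and return early when it is false:

```python
from datetime import date


def is_publishing_day(day: date) -> bool:
    """True for Monday through Friday (weekday() returns 0-4 for those days)."""
    return day.weekday() < 5


if __name__ == "__main__":
    today = date.today()
    print(f"{today}: {'publish' if is_publishing_day(today) else 'skip'}")
```

The scheduler approach keeps the script simpler; the in-script check keeps the schedule in one place if you ever move to cron.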
This should work but I haven't run it yet (the script won't run a second time on a given day).
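The once-a-day limit comes from posted.json, which the script writes to the working directory with one `issue:<title>` key per published edition. If a test run needs repeating, one option is to remove that day's key. A minimal sketch follows, assuming the file format the script writes (a JSON list of strings). The helper `forget_issue` is hypothetical:

```python
import json
from pathlib import Path


def load_posted(path: Path) -> set:
    """Read the set of already-posted keys; a missing or corrupt file means empty."""
    try:
        return set(json.loads(path.read_text()))
    except (FileNotFoundError, ValueError, TypeError):
        return set()


def forget_issue(path: Path, title: str) -> bool:
    """Remove one issue key so that edition can be posted again. True if removed."""
    posted = load_posted(path)
    key = f"issue:{title}"
    if key not in posted:
        return False
    posted.remove(key)
    path.write_text(json.dumps(sorted(posted), indent=2))
    return True


if __name__ == "__main__":
    # Example title only; use the title of the edition you want to re-run.
    print(forget_issue(Path("posted.json"), "OLDaily - 2025-09-05"))
```

Bear in mind this only clears the local guard; an issue already published on LinkedIn would be posted a second time.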
That's it!