Puppeteer and JavaScript help requested

  • Tico
    Lox Guru
    • 31.08.2016
    • 1035

    #1

    I've successfully set up Puppeteer on a Raspberry Pi to scrape data from my electricity account. The plan is to use this data in the Miniserver.

    I have no JavaScript experience and have cobbled together a script from various examples that gets close to what I need. I don't have the know-how to fine-tune the last details.

    The script -

    1. Logs into the account with Username and Password,
    2. Finds the text relevant to the search term,
    3. Saves the text to a file,
    4. Saves a screenshot of the web-page to a file,
    5. Prints the text to the console,
    6. Sends text via UDP to the Miniserver.

    Here is the script -
    Code:
    // Synergy Unbilled Usage
    
    var dgram = require('dgram');
    var client = dgram.createSocket('udp4');
    const puppeteer = require('puppeteer-core');
    
    (async ()=> {
        const browser = await puppeteer.launch({product: 'chrome', executablePath: '/usr/bin/chromium-browser', userDataDir: '/home/data', headless: true});
        const page = await browser.newPage();
        await page.goto('https://selfserve.synergy.net.au/my-account.html');
        await page.type('#loginForm-username', 'My_Username');
        await page.type('#loginForm-password', 'My_Password');
        await page.click("button[type=submit]");
    
        await page.waitFor(30000);
        const price = await page.evaluate(() => {
            return Array.from(document.querySelectorAll(".payment-value.ng-binding")).map(x => x.textContent);
            });
        await page.screenshot({path: 'Synergy_Login.png'});
        
        const fs = require('fs');
        fs.writeFileSync('/home/file.txt', price[0]);
            
        console.log(20, price[0]);
        
        client.send('Hello World!',0, 12, 7001, '10.1.1.3');
                    
        await browser.close();
        
        client.close();
        
    })();

    Unfortunately, the text I get back comes from the wrong element in the web-page. The search term matches an earlier element because the same class appears many times on the page.

    The web-scraping script uses the search term as follows (where ".ng-binding" appears immediately before the text I require) -

    Code:
    const price = await page.evaluate(() => {
            return Array.from(document.querySelectorAll(".ng-binding")).map(x => x.textContent);
            });

    The web-page has the following code around the field I wish to scrape -

    Code:
    </div>
        <!-- ngInclude: 'app/routes/user/account/dashboard/dashboard-usage.html' --><div ng-include="'app/routes/user/account/dashboard/dashboard-usage.html'" class="ng-scope"><div ng-controller="DashboardUsageCtrl as dashboardUsageCtrl" class="ng-scope">
    
        <!-- ngIf: accountCtrl.isCollective() -->
    
        <!-- ngIf: accountCtrl.isAmiCustomer() && !accountCtrl.isCollective() --><div class="account-history panel ami-account-summary ng-scope col-md-6" ng-if="accountCtrl.isAmiCustomer() &amp;&amp; !accountCtrl.isCollective()" ng-class="{true:'col-md-6',false:'col-md-12'}[dashboardUsageCtrl.isResidential]">
            <div class="panel-header">
                <div class="panel-header-container">
                    <h2 class="pull-left panel-title">
                        Energy usage
                    </h2>
                    <a href="javascript:void(0);" title="Energy usage tooltip" ng-click="dashboardUsageCtrl.amiEnergyToolTip()" class="pull-right title-info-icon" data-event="site-interaction" data-location="body" data-description="Energy usage tooltip" data-type="link">
                        <span class="sy-icon--circle_info"></span>
                    </a>
                </div>
            </div>
            <div class="panel-body">
                <!-- ngInclude: 'app/routes/user/account/dashboard/ami/ami-acccount-summary-chart.html' --><div ng-include="'app/routes/user/account/dashboard/ami/ami-acccount-summary-chart.html'" class="ng-scope"><div class="usage-bar-chart ng-scope" ng-controller="AmiAcSummaryCtrl as amiAcSummaryCtrl">
    
        <!-- ngIf: amiAcSummaryCtrl.showChart && amiAcSummaryCtrl.isActiveAcWithUnbilledAmounts() --><div ng-if="amiAcSummaryCtrl.showChart &amp;&amp; amiAcSummaryCtrl.isActiveAcWithUnbilledAmounts()" class="ng-scope" style="">
            <div class="content">
                <p class="text"><b class="ng-binding">Total unbilled tariff charge is $0.00</b></p>
                <a title="Link to supply charges and GST" sy-doc-href="synergy.pricecharges.brochure" target="_blank" data-event="site-interaction" data-location="body" data-description="Link to supply charges and GST" data-type="hyperlink" href="https://www.synergy.net.au/-/media/Files/PDF-Library/Standard_Electricity_Prices_Charges_brochure.pdf?desktop=true">
                    Includes supply charges and GST
                </a>
                <p><b>*Prices are subject to change</b></p>
            </div>

    The text I wish to send to the Miniserver is "Total unbilled tariff charge is $0.00". How can the search term be constrained to target just that text element?
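
    One possible approach, sketched here under assumptions rather than tested against the live site: either make the CSS selector more specific using the ancestors visible in the HTML above (for example `.usage-bar-chart .content b.ng-binding`), or keep a broad selector and pick the wanted entry by its known leading text. The helper name `pickByPrefix` and the first two sample strings are hypothetical; only the "Total unbilled tariff charge" wording comes from the page source.

    ```javascript
    // Hypothetical helper: return the first scraped string that starts with a known label,
    // or null when nothing matches.
    function pickByPrefix(texts, prefix) {
        const match = texts
            .map(t => (t || '').trim())
            .find(t => t.startsWith(prefix));
        return match === undefined ? null : match;
    }

    // Sample input standing in for the array returned by page.evaluate().
    // The first two entries are made up to represent earlier matches on the page.
    const scraped = ['  $123.45 ', 'Due 01 Jan', 'Total unbilled tariff charge is $0.00'];
    const wanted = pickByPrefix(scraped, 'Total unbilled tariff charge');
    console.log(wanted);
    ```

    Matching on the label text keeps working if the page gains more `.ng-binding` elements, whereas a fixed index such as `price[0]` depends on the element order.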

    The text that is sent via UDP to the Miniserver is currently "Hello World!". I want to send the variable price[0] instead, where price[0] should hold the text "Total unbilled tariff charge is $0.00".
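
    A minimal sketch of sending a variable rather than a fixed string, assuming Node's built-in `dgram` module as used in the script above. The names `toPayload` and `sendText` are hypothetical; port 7001 and address 10.1.1.3 are the values from the script. Passing a Buffer lets `send` work out the length itself, so the offset and length arguments used for 'Hello World!' are not needed.

    ```javascript
    const dgram = require('dgram');

    // Convert any value to a UTF-8 Buffer; null or undefined becomes an empty payload.
    function toPayload(text) {
        return Buffer.from(text == null ? '' : String(text), 'utf8');
    }

    // Hypothetical helper: send one datagram, then close the socket in the send callback
    // so the socket is not closed before the message has been handed to the OS.
    function sendText(text, port, host) {
        return new Promise((resolve, reject) => {
            const socket = dgram.createSocket('udp4');
            socket.send(toPayload(text), port, host, (err) => {
                socket.close();
                if (err) reject(err); else resolve();
            });
        });
    }

    // Usage inside the async function, once price has been scraped:
    //     await sendText(price[0], 7001, '10.1.1.3');
    ```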
    Last edited by Tico; 07.03.2023, 06:32.
    I don't speak German. Blame Google Translate if I'm unintelligible.
  • Tico
    Lox Guru
    • 31.08.2016
    • 1035

    #2
    I received some helpful advice over at Stack Overflow. To close the loop on this, here's the finished script for anyone who wants to modify it for their own needs. The script -

    1. Logs into a secure website with username/password,
    2. Takes a web-page screenshot (useful for debugging),
    3. Saves the desired text element to a file,
    4. Prints the desired text element to the terminal (useful for debugging),
    5. Sends the text element via UDP to the Miniserver.

    Code:
    // Synergy Unbilled Tariff Metric using Puppeteer

    const dgram = require('dgram');
    const fs = require('fs');
    const puppeteer = require('puppeteer-core');

    const client = dgram.createSocket('udp4');

    (async () => {
        const browser = await puppeteer.launch({product: 'chrome', executablePath: '/usr/bin/chromium-browser', userDataDir: '/home/synergy_web_scraper/data', headless: true});
        const page = await browser.newPage();

        // Log in with the account credentials.
        await page.goto('https://selfserve.synergy.net.au/my-account.html');
        await page.type('#loginForm-username', 'My_Username');
        await page.type('#loginForm-password', 'My_Password');
        await page.click("button[type=submit]");

        // Fixed 30 s wait for the dashboard to render.
        // Note: page.waitFor() is deprecated in newer Puppeteer releases.
        await page.waitFor(30000);

        // Collect the text of every .ng-binding element inside a .content block.
        const price = await page.evaluate(() => {
            return Array.from(document.querySelectorAll(".content .ng-binding")).map(x => x.textContent);
        });
        await page.screenshot({path: 'Synergy_Login.png'});

        fs.writeFileSync('/home/synergy_web_scraper/file.txt', price[0]);
        console.log(price[0]);

        // Send the text to the Miniserver on UDP port 7001.
        client.send(price[0], 7001, '10.1.1.3');
        await browser.close();
        client.close();
    })();
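
    If only the number is of interest, the amount could be pulled out of the scraped sentence before sending. This is a sketch and not part of the script above; `parseDollarAmount` is a hypothetical name, and the pattern assumes the amount is written as a dollar sign followed by digits, as in the page source shown earlier.

    ```javascript
    // Hypothetical helper: extract the first dollar amount from a string as a Number,
    // or return null when no amount is present.
    function parseDollarAmount(text) {
        const m = /\$\s*(-?[\d,]+(?:\.\d+)?)/.exec(text || '');
        if (!m) return null;
        // Drop thousands separators before converting.
        return Number(m[1].replace(/,/g, ''));
    }

    const amount = parseDollarAmount('Total unbilled tariff charge is $0.00');
    console.log(amount);
    ```

    The result would need to be converted back to a string (for example with `String(amount)`) before being passed to `client.send`.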
