
Downloading from an HTTPS web server


ghost's Avatar

I am attempting to automatically download a zip file (it has data on potatoes and the like) from an HTTPS web server. I have a username and password, but I want to be able to download and save the file automatically, since this program will be running as a service. The site does not use cookies to authenticate, and it does not have port 21 open; the only port open on the server is 443. My current solution is to use a third-party macro tool that interacts with my program to click the save button and then type in a file name to save it as.

I am using MS C# 2010 and have been using the WebBrowser control, simply because I do not know another method. I have tried going to the download page without authenticating using the WebClient class, and it just redirects me to the default.aspx page. I have also tried using the WebClient class with its network credentials property to authenticate before downloading, but still no luck. Any ideas?

Thanks,

wired_al
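
For reference, a minimal sketch of what the second WebClient attempt described above might look like. The class and method names, the URL, and the account values are placeholders, not taken from the real site. One thing worth knowing: NetworkCredential only answers HTTP-level authentication challenges (Basic, Digest, NTLM, and similar). If default.aspx shows its own login form inside the page, setting Credentials would have no effect, which would be consistent with the redirect described in the post.

```csharp
using System;
using System.Net;

static class CredentialSketch // hypothetical helper, for illustration only
{
    // Builds a WebClient that responds to HTTP authentication challenges
    // with the given user name and password.
    public static WebClient CreateClient(string userName, string password)
    {
        WebClient wc = new WebClient();
        wc.Credentials = new NetworkCredential(userName, password);
        return wc;
    }

    // Downloads url to localPath using the credentials above.
    // If the server answers with a login page instead of a 401 challenge,
    // the file written here will be that HTML page, not the zip.
    public static void DownloadWithCredentials(
        string url, string userName, string password, string localPath)
    {
        using (WebClient wc = CreateClient(userName, password))
        {
            wc.DownloadFile(url, localPath);
        }
    }
}
```

Usage would be something like CredentialSketch.DownloadWithCredentials("https://website.gov/downloadAll.aspx?userName=xxxxxxxxxx", "username", "password", @"c:\temp_data\data.zip"), with all four arguments replaced by real values.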


spyware's Avatar
Banned

cron + wget in linux, scheduler + wget in windows.


ghost's Avatar

The only problem is something I should probably have stated earlier: I do not have access to the location of the file in any way that I know of. It's all done through a combination of JavaScript and ASPX. I just got the source from the downloadAll.aspx page, which I will post. Basically, I know they are running IIS. wget won't work, because it does not like unauthenticated download attempts…

Here is the code:


<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

<html xmlns="http://www.w3.org/1999/xhtml" >
<head><title>
	Download All Files
</title></head>
<body>
    <form name="form1" method="post" action="downloadAll.aspx?userName=xxxxxxxxxx" id="form1">
<div>
<input type="hidden" name="__VIEWSTATE" id="__VIEWSTATE" value="/nothing/important"> <!--note, edited to protect clients data -->
</div>

    <div>
    <span id="thisLabel"></span>
    </div>
    </form>

<script language='javascript'>alert( "Could not find file 'C:\inetpub\TempUser\xxxxxxxxxxFFVFiles.zip'." );</script>
</body>
</html>

It is a state-run government website, and we are trying to talk to the admin, but he won't even reply… government workers… Also, this service is not only a download service: it also extracts the XML files that are in the zip file, parses them, stores the data in the on-site SQL database, and then pulls reports. All of this is working except for the download…


stealth-'s Avatar
Ninja Extreme

wired_al wrote: wget won't work, because it does not like unauthenticated download attempts…

Wget can handle authentication.

wget --user=fbi --password=s3cr3t http://site.gov/downloadAll.aspx?userName=xxxxxxxxxx

Something like that? It's a little hard to tell considering you erased over half the HTML.


ghost's Avatar

stealth- wrote:

wired_al wrote: wget won't work, because it does not like unauthenticated download attempts…

Wget can handle authentication.

wget --user=fbi --password=s3cr3t http://site.gov/downloadAll.aspx?userName=xxxxxxxxxx

Something like that? It's a little hard to tell considering you erased over half the HTML.

Sadly, I did not erase the HTML… I changed the username, and I erased one hidden field that had some base64-encoded data in it. I decoded it and it is not relevant…

I just tried the wget command that you posted (except, of course, I changed the username, password, and URL) and it gave me the source to the main page because it didn't like my authentication…
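
Since an unattended service would otherwise save that HTML page as if it were the data, one way to catch this failure is to check whether the downloaded bytes really look like a zip archive. A minimal sketch with hypothetical helper names; it relies only on the fact that a non-empty zip file begins with the local file header signature 'P', 'K', 0x03, 0x04:

```csharp
using System;
using System.IO;

static class ZipCheck // hypothetical helper, for illustration only
{
    // True if the buffer starts with the zip local file header signature.
    // An empty archive starts with a different signature (PK 0x05 0x06)
    // and is deliberately rejected here, since it carries no data.
    public static bool LooksLikeZip(byte[] data)
    {
        return data != null
            && data.Length >= 4
            && data[0] == 0x50   // 'P'
            && data[1] == 0x4B   // 'K'
            && data[2] == 0x03
            && data[3] == 0x04;
    }

    // Reads the first four bytes of a file on disk and applies the same check.
    public static bool FileLooksLikeZip(string path)
    {
        byte[] header = new byte[4];
        using (FileStream fs = File.OpenRead(path))
        {
            int read = fs.Read(header, 0, header.Length);
            return read == header.Length && LooksLikeZip(header);
        }
    }
}
```

A login or error page starts with text such as "<!DOCTYPE html" instead, so the check fails and the service can log the problem rather than try to unzip a web page.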

Thanks -wired_al


stealth-'s Avatar
Ninja Extreme
  1. How does this server authenticate with the user again?
  2. You remembered it's https://, right?
  3. The HTML code you posted is not a functioning page, so obviously something is wrong.

ghost's Avatar

stealth- wrote:

  1. How does this server authenticate with the user again?
  2. You remembered it's https://, right?
  3. The HTML code you posted is not a functioning page, so obviously something is wrong.

  1. A username and password prompt at default.aspx (cookies are not used, and I cannot find a session ID either…)
  2. Yes.
  3. Probably because it is an .aspx page that is never actually seen. I had to type view-source: in Firefox in front of the address. This is what happens when you click the download all link: it brings up a popup window, which is downloadAll.aspx?userName=xxxxxxxxxx, then starts the download of the zip file and closes the little popup. I don't think this page is ever meant to be seen by the user. (Note: ASPX code is run on the server, kind of like PHP; only the output is sent to the browser.)

When I click the date link to choose what data I want, this function is executed:

function showFiles(user, fileDate){
   //setValue("")
   document.body.style.cursor = 'wait';
   PageMethods.getFilesByDate(user, fileDate, OnSucess, OnFail);
   var url = "javascript:downAll('" + user + "');";
   $get('ctl00_centerBox_downAllFiles').innerHTML = "<a class='a' href=" + url + ">Download All Files (zip file)</a>";
}

Here, user is my username and fileDate is the date I want data for. When I click the download all button, this code is executed:

function downAll(user){
   window.open("downloadAll.aspx?userName=" + user, "", "height=2,width=2");
}

which results in the code I posted earlier.

When I tell it to download only one file, this is the function that is executed:

function checkForExpiredSession(linkURL) {
    //setValue("")
    PageMethods.checkForExpiredSession(linkURL, showUserFile, OnFail);
}

// Note: this is the PageMethods.checkForExpiredSession prototype; I cannot seem to find the code.
PageMethods.checkForExpiredSession = function(linkURL, onSuccess, onFailed, userContext)

And by the way, the login page is the same as this page, and the code that is delivered to the browser does not change whether I am logged in or not.

Thanks (yet again) -wired_al


ghost's Avatar

Oddly enough, it was this line of code that helped me find my answer last night at around 1:00 or so…

<script language='javascript'>alert( "Could not find file 'C:\inetpub\TempUser\xxxxxxxxxxFFVFiles.zip'." );</script>

I figured I couldn't access the TempUser directory, since the default directory for web hosting is c:\inetpub\wwwroot, but thought that at this point anything was worth a shot. So after clicking the link for the date, I went to https://website.gov/TempUser/xxxxxxxxxxFFVFiles.zip and it began to download. I figured that I would let the WebBrowser control authenticate automatically and click the links, and then I would use the WebClient class to download, like this:

// After authenticating and clicking the link for today's date...
// (WebClient lives in System.Net, so the file needs "using System.Net;")
using (WebClient wc = new WebClient())
{
    wc.DownloadFile("https://website.gov/TempUser/xxxxxxxxxxFFVFiles.zip", "c:\\temp_data\\" + currentdate + ".zip");
}

It works. The reason I have to click the link for the date I want first is that the .zip files are nonexistent until I click the link and the JavaScript function runs; then the .zip file is created and I can download it.
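
The download step above could be wrapped up for a service roughly as follows. This is only a sketch under assumptions: the class and method names are made up, the file name is assumed to be the username followed by "FFVFiles.zip" (inferred from the path in the error message), and the local file is assumed to be named after the date in yyyy-MM-dd form so that it contains no characters that are invalid in Windows file names.

```csharp
using System;
using System.Globalization;
using System.IO;
using System.Net;

static class ZipFetcher // hypothetical helper, for illustration only
{
    // Assumed pattern: <base>/TempUser/<username>FFVFiles.zip
    public static string BuildZipUrl(string baseUrl, string userName)
    {
        return baseUrl.TrimEnd('/') + "/TempUser/"
            + Uri.EscapeDataString(userName) + "FFVFiles.zip";
    }

    // Produces e.g. <directory>\2010-06-01.zip; the fixed format avoids the
    // '/' and ':' characters that a default DateTime.ToString() can contain.
    public static string BuildLocalPath(string directory, DateTime date)
    {
        string fileName =
            date.ToString("yyyy-MM-dd", CultureInfo.InvariantCulture) + ".zip";
        return Path.Combine(directory, fileName);
    }

    // Must be called only after the date link has been clicked,
    // because the zip does not exist on the server before that.
    public static string Download(
        string baseUrl, string userName, string directory, DateTime date)
    {
        Directory.CreateDirectory(directory); // no-op if it already exists
        string localPath = BuildLocalPath(directory, date);
        using (WebClient wc = new WebClient())
        {
            wc.DownloadFile(BuildZipUrl(baseUrl, userName), localPath);
        }
        return localPath;
    }
}
```

A call such as ZipFetcher.Download("https://website.gov", "xxxxxxxxxx", @"c:\temp_data", DateTime.Today) would then replace the hard-coded strings in the snippet above.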

Thanks for your help -wired_al


stealth-'s Avatar
Ninja Extreme

Glad you got it working.

Out of curiosity, what exactly is all this for…?


ghost's Avatar

stealth- wrote: Glad you got it working.

Out of curiosity, what exactly is all this for…?

The people who have hired me buy potatoes from the farms, then sell them to people who sell potatoes. Up until now, they have been told how much money they should be paid by the people they sell potatoes to, because they are too lazy to look through the XML files that have all the weights and types of potatoes and such. I heard about this and talked to them, and now my program downloads the data, parses it, dumps it into a SQL database, then generates reports and delivers them to the president and owners of the company. They can then tell how much money they should be getting paid.

If you are from my state, you probably know it. What other state would have a state-run website dedicated to the movement of potatoes?