Wikipedia:Reference desk/Archives/Computing/2014 April 9
April 9
Controlling the amount of time a piece of code executes itself
I'm fairly new to Python, which is why I've been going around asking all of these questions. I've cooked up a bit of code that streams tweets from Twitter in real time. So once I run the following code, the tweets keep streaming in and saving themselves to a text file on my desktop. The thing is, I haven't figured out a way to end the script yet, and I resort to closing the entire Python window to stop streaming. I'd like to put this into a function that I can call, but that would be cumbersome if there is no way to keep the script from running infinitely. I could specify either the number of tweets after which to stop, or the time elapsed after which the function stops. Please tell me how I can set a small timer that stops the execution of the function after, say, 5 minutes. How do you even create time variables in Python? And what is the format for specifying time?
# -*- coding: utf-8 -*-
import json
import time
import tweepy
from tweepy import Stream
from tweepy import OAuthHandler
from tweepy.streaming import StreamListener
ckey = '<twitter_key1>'
csecret = '<twitter_key1>'
atoken = '<twitter_key1>'
asecret = '<twitter_key1>'
class listener(StreamListener):
    def on_data(self, data):
        try:
            data = json.loads(data)
            tweet = data['text']
            print tweet
            with open("C://Users/La Alquimista/Desktop/hindistream.txt", "a") as f:
                f.write(tweet.encode("UTF-8"))
                f.write('\n\n')
            return True
        except BaseException, e:
            print 'failed ondata,',str(e)
            time.sleep(5)
    def on_error(self,status):
        print status
auth = OAuthHandler(ckey, csecret)
auth.set_access_token(atoken, asecret)
twitterStream = Stream(auth, listener())
twitterStream.filter(track=["को".decode('utf-8', 'ignore')])
La Alquimista 09:30, 9 April 2014 (UTC)
- Record the time at the start, and then periodically check to see if your specified interval has elapsed. This is easier than using threads or timers; it has the disadvantage that it won't interrupt the script if it gets stuck somewhere (e.g. in a really slow connection to Twitter). This example repeatedly polls for user input, and quits after 30 seconds - but it won't interrupt the user inputting (that sounds like what you want too - you won't want to interrupt processing a specific tweet, you just want to quit gracefully some reasonable time after your time limit has expired).
import time
start_time = time.time()
while True:
    raw_input('>')
    if (time.time() - start_time) > 30.0: # are we more than 30 seconds from when we started
        break # you might want to do a sys.exit() here
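On the "how do you create time variables" part of the question, here is a minimal sketch. It is written for Python 3 (the code in this thread is Python 2) and uses only the standard library. A time reading is just a float number of seconds, so a duration such as five minutes is the plain number 300. The helper name seconds_left is made up for illustration, and time.monotonic() is swapped in for time.time() because it is not affected by changes to the system clock.

```python
import time

FIVE_MINUTES = 5 * 60  # durations are plain numbers of seconds


def seconds_left(start, limit):
    """Hypothetical helper: seconds remaining until `limit` seconds have passed since `start`."""
    return limit - (time.monotonic() - start)


start = time.monotonic()   # a float; only differences between readings are meaningful
time.sleep(0.1)            # stand-in for real work
remaining = seconds_left(start, FIVE_MINUTES)
print("elapsed: %.2f s, remaining: %.2f s" % (FIVE_MINUTES - remaining, remaining))
```

A loop would then stop once seconds_left(...) drops to zero or below, which is the same check as in the polling example above.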
The simplest way is with the signal.alarm() function. You set an alarm to raise a signal after 5 minutes, and register a signal handler that gracefully shuts the program down. There is an example in the doc page. It's kind of inflexible because it means your stream reader has to run in the program's main thread and you have to handle the exception there, but doing it the "right" way (involving a timer in a separate thread kicking an asynchronous exception into the streaming thread) is a big pain that you really don't want to deal with right now. 70.36.142.114 (talk) 20:30, 9 April 2014 (UTC)
- signal.alarm() is available only on Unix(alikes); La Alquimista is developing on Windows. -- Finlay McWalterᚠTalk 21:12, 9 April 2014 (UTC)
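For readers on Unix-like systems, the signal.alarm() idea might look roughly like the sketch below (Python 3). The names TimeLimitReached, run_with_alarm and endless_work are invented for this example, and the endless loop is only a stand-in for the blocking stream call. As noted above it will not work on Windows; the sketch simply reports "unsupported" there, or when it is not running in the main thread.

```python
import signal
import threading
import time


class TimeLimitReached(Exception):
    """Hypothetical exception raised by the alarm handler."""


def _on_alarm(signum, frame):
    raise TimeLimitReached()


def run_with_alarm(work, seconds):
    """Run work() for at most `seconds` whole seconds (Unix, main thread only)."""
    in_main = threading.current_thread() is threading.main_thread()
    if not hasattr(signal, "SIGALRM") or not in_main:
        return "unsupported"
    old_handler = signal.signal(signal.SIGALRM, _on_alarm)
    signal.alarm(seconds)          # ask for SIGALRM after `seconds`
    try:
        work()
        return "finished"
    except TimeLimitReached:
        return "timed out"
    finally:
        signal.alarm(0)            # cancel any alarm still pending
        signal.signal(signal.SIGALRM, old_handler)


def endless_work():
    while True:                    # stand-in for the blocking stream call
        time.sleep(0.05)


result = run_with_alarm(endless_work, 1)
print(result)
```

The exception is raised inside whatever the main thread happens to be doing at that moment, so cleanup has to be written with that in mind.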
- Oh ok, I didn't see the Windows mention and didn't realize anyone still used Windows. In this case the easiest thing is just to check time.time() in the on_data method, and if it's more than 300 sec from the start time, exit the program. This means it can take longer than 5 minutes if there are no tweets for a while, or if there's a network hang. The stream protocol seems badly designed in that regard. A brutal method would be to launch a thread that sleeps 5 minutes then os.kill() the process. Or it might be possible to send some other signal that the main thread can catch. 70.36.142.114 (talk) 21:45, 9 April 2014 (UTC)
- Instead of killing the process from the sleeper thread (which won't execute with statement finalizers, etc.), it would probably be better to call thread.interrupt_main(). But I strongly advise against using either of these approaches (sleeper thread or signal), because of the risk of deadlock. Asynchronous exceptions are a rat's nest. Polling (of time.time() in this case) is the only safe way to interrupt a thread. -- BenRG (talk) 00:19, 10 April 2014 (UTC)
- Yes, the issue is that I don't see a reliable way to poll with a bounded interval, because the stream operation can go off and twiddle its thumbs for an indeterminate amount of time. I don't even see any mention of the Stream interface in the tweepy docs. Doing precise timeouts around synchronous socket operations in Python is quite a hassle and I remember battling with the issue from some time back. Yes, killing the process from the sleeper thread is extremely ungraceful, so its suitability depends on what cleanup you need. You could go whole hog and put the stream reader in a separate process (rather than thread) and kill that, if that's not too painful in Windows. If this is just for personal use and you're willing to put up with a bit of imprecision, then just checking the timer in ondata seems easiest. 70.36.142.114 (talk) 05:23, 10 April 2014 (UTC)
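A rough sketch of the "separate process" option, in Python 3 and using only the standard library. The function name run_child_with_limit is made up, and the one-line child program is a stand-in for the real streaming script. subprocess.run() with a timeout kills the child when the limit expires, which should work on Windows as well, but the child gets no chance to clean up.

```python
import subprocess
import sys


def run_child_with_limit(code, limit_seconds):
    """Run `code` in a fresh Python process; kill it if it outlives the limit."""
    try:
        subprocess.run([sys.executable, "-c", code], timeout=limit_seconds)
        return "finished"
    except subprocess.TimeoutExpired:
        # subprocess.run() has already killed and reaped the child here.
        return "killed after time limit"


# Stand-in for the streaming script: a child that would otherwise run for a minute.
outcome = run_child_with_limit("import time; time.sleep(60)", 1.0)
print(outcome)
```

Because the child is killed outright, anything it had buffered but not yet written to disk may be lost, which matches the "extremely ungraceful" caveat above.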
- Calling sock.shutdown(SHUT_RDWR) in the sleeper thread might work. I just tried it in Python 3 on Windows 7 and it unblocks sock.recv() as though the remote had cleanly disconnected. I don't know how portable that is, but it may be preferable to the other hacks in this specific circumstance. Getting the socket object could be a hassle, judging from this source code. You could wrap _read_loop() and extract it from resp. -- BenRG (talk) 07:59, 10 April 2014 (UTC)
- That's interesting and I don't think I tried it, but what if the problem is in socket.connect? E.g. there is a slow DNS lookup that you want to bail out from. I might experiment with this approach on Linux. If it works, it's a good technique to know. 70.36.142.114 (talk) 15:34, 10 April 2014 (UTC)
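The shutdown trick can be tried without any network access. The sketch below (Python 3) uses socket.socketpair() as a stand-in for the real connection and a threading.Timer as the sleeper thread; the 5-second socket timeout is only a safety net so the demo cannot hang if shutdown does not unblock recv() on some platform. It says nothing about the connect or DNS case raised above.

```python
import socket
import threading

# A connected pair stands in for the connection to the remote server.
reader, peer = socket.socketpair()
reader.settimeout(5.0)   # safety net so this demo cannot hang forever

# "Sleeper thread": after 0.5 s, shut the socket down from outside the reader.
sleeper = threading.Timer(0.5, reader.shutdown, args=(socket.SHUT_RDWR,))
sleeper.start()

try:
    data = reader.recv(4096)   # blocks: the peer never sends anything
    outcome = "recv returned %r" % (data,)
except socket.timeout:
    outcome = "shutdown did not unblock recv on this platform"
finally:
    sleeper.join()
    reader.close()
    peer.close()

print(outcome)
```

An empty bytes result from recv() is what a clean disconnect by the remote end looks like, so the reading code should be able to treat it as end of stream.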
- Thank you for your answers, everyone. I don't need precision right now, and crude hacks work fine as long as they get the work done. I didn't want to poll for user input, so I just created a function with the entire streaming code inside, which accepts a time value and runs the script for that duration. It has worked fine so far. I don't require the code to run for more than 3 or 4 minutes at a stretch anyway. La Alquimista 06:16, 10 April 2014 (UTC)
- Wait, how does your function run the script "for that duration"? I thought that was the whole difficult task here. 70.36.142.114 (talk) 15:34, 10 April 2014 (UTC)
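For what it's worth, one way such a function could honor a time limit is the on_data check suggested earlier in the thread. The sketch below (Python 3) is not La Alquimista's actual code: TimedListener and fake_stream are invented stand-ins so that it runs without tweepy or a network connection. It assumes the library's read loop stops when on_data returns False, which I believe tweepy versions of that period do, but that should be checked against the installed version.

```python
import time


class TimedListener(object):
    """Hypothetical stand-in for a stream listener with a time limit."""

    def __init__(self, time_limit):
        self.deadline = time.time() + time_limit   # seconds since the epoch
        self.received = []

    def on_data(self, data):
        if time.time() > self.deadline:
            return False           # tell the caller to stop streaming
        self.received.append(data)
        return True


def fake_stream(listener, items, delay):
    """Stand-in for the library's read loop: stops when on_data returns False."""
    for item in items:
        time.sleep(delay)          # pretend to wait for the next tweet
        if listener.on_data(item) is False:
            break


listener = TimedListener(time_limit=0.3)
fake_stream(listener, range(1000), delay=0.05)
print("kept %d items before the deadline" % len(listener.received))
```

The limit is only checked when data arrives, so a quiet stream or a network hang can still overrun it, as pointed out above.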
Network Anonymity
There's an office policy in place that restricts visits to certain websites, like YouTube and Facebook, but by using psiphon3, office staff are able to circumvent the normal block and visit these websites. Is this because IT doesn't realize that it's being bypassed, or does it know it's being bypassed, or perhaps it doesn't even know that it doesn't know? Is this a sound manner of maintaining network anonymity, or are these page visits still being logged but just not amounting to trouble in the short run? Normally, penalties are dealt out if IT finds someone visiting social media, etc. during billable time. Thank you. BobbyDay33 (talk) 11:37, 9 April 2014 (UTC)
- I've never used psiphon3 before, but its website says it uses "VPN, SSH and HTTP proxy technology to provide you with uncensored access to Internet content." What this means is that there's a computer sitting between your network and the internet. You're communicating directly with that computer and telling it what sites to go to, and that computer passes it on to you. Some HTTP proxies are not very secure, in which case the network operator may be able to see the traffic between you and it, but it seems likely that in this case it's encrypted (VPNs always are, as far as I know). So as long as the data is encrypted, only you, the VPN, and the site you're visiting are able to access it.
- That being said, VPNs and proxies are widely used and the IP addresses they use are frequently shared. So while your IT department wouldn't be able to see the sites you're going to, it'll look awfully suspicious that all of your traffic is going to and coming from a single other computer. Then if it's a known proxy and IT bothers to check up on it, it'll be clear that you're using a proxy. My guess is that's as worthy of a penalty as anything. -- Rhododendrites talk | 12:24, 9 April 2014 (UTC)
- That may be so, but there are dozens of people working at this office, and it seems unlikely that they would give 25 or more people a penalty for doing the same thing when they've been doing it for over a year, but I'm not here for legal advice. I went to the website and couldn't make any sense of the computer jibber jabber, but you're saying that it's untraceable. Seems pretty odd that something as sophisticated as a firewall can be so easily circumvented by something as silly as this VPN thing. DobbyBay33 (talk) 12:48, 9 April 2014 (UTC)
The normal practice would be to block the VPN itself for exactly that reason. Either they don't know about it or they're not very good at their jobs. I suppose the former implies the latter. -- BenRG (talk) 16:23, 9 April 2014 (UTC)
- In defense of the IT department, BenRG, the "security hole" is in this case that a privileged user - an employee inside the firewall - is colluding with an outsider - the external proxy-server or VPN provider. In that sense, there is a privilege escalation by way of social engineering (making an external service that is appealing to users inside the corporate network). At best, the firewall is circumvented; at worst, a malicious attacker who runs the proxy-server now has an entrypoint into the secure network. If the employees inside the firewall had not intentionally opened up this avenue, the firewall would probably be secure from outside intrusion. As always, computer security devolves into a question of who should be trusted; in this instance, employees were permitted to open a VPN connection - that could be a legitimate business-need that should not be blocked. However, the IT department trusted that employees would not open a VPN connection to an external party; but some employees choose to do so anyway. A draconian department could start black-listing and white-listing individual socket-ports and individual employees; but the management overhead may outweigh the reward.
- IBM publishes a book series, the Red Books, and there are voluminous writings on the subject of Best Practices for IT departments - covering the technical, non-technical, and business strategies that work for enterprises. Here's a series on Firewall Best Practices for System Z. Microsoft also publishes a guideline, Windows Firewall Integration and Best Practices, which is geared towards individual users. The point is, an IT department's security strategy has to be considered as part of a larger picture whose requirements are driven by other-than-technical details. It's a bit mean-spirited to start lobbing around claims of incompetence; a lot of IT professionals know exactly what they're doing; and yet, they take flak from all sides for being too draconian or not draconian enough. The department becomes the social pariah of the corporation. Nimur (talk) 17:28, 9 April 2014 (UTC)
- Yes, you're right. -- BenRG (talk) 18:37, 9 April 2014 (UTC)
- Many IT departments chafe at the idea that they should be responsible for proactively policing the unprofessional behaviour of others, like some kind of cyberlogical Bottom Inspectors. Having issued the policy, and taken the reasonable technical step of blocking the sites at the firewall, they might reasonably say that, in a professional workplace, it's not their job to stop their coworkers from breaking that policy anyway. Particularly when someone has taken the overt and proactive step of setting up a VPN to wilfully circumvent that policy, they've opened themselves up to being fired for cause. -- Finlay McWalterᚠTalk 17:06, 9 April 2014 (UTC)
- Yes, what Finlay said, but using many more words! Nimur (talk) 17:33, 9 April 2014 (UTC)
- "Seems pretty odd that something as sophisticated as a firewall can be so easily circumvented by something as silly as this VPN thing" - well, there are three ways that the security department could stop it: 1. stateful packet inspection, 2. only allowing connections to specified hosts, 3. blocking proxies a) from a public proxy list or b) by testing outgoing hosts dynamically and maintaining an internal list. But why bother? You can visit these sites on your phone, and a moderate amount of fun makes people more productive. All the best, Rich Farmbrough, 23:51, 9 April 2014 (UTC).
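To make options 2 and 3a above a little more concrete, here is a toy sketch in Python 3. It is not how any particular firewall product is configured: the function names are invented, and the addresses are made-up examples drawn from the ranges reserved for documentation. It only shows the difference between an allow-list (everything is refused unless listed) and a block-list (everything is permitted unless listed).

```python
import ipaddress

# Made-up example data using documentation-only address ranges.
ALLOWED_NETWORKS = [ipaddress.ip_network("192.0.2.0/24")]
KNOWN_PROXIES = {ipaddress.ip_address("203.0.113.7")}


def allow_only_listed(destination):
    """Option 2: permit a connection only if the destination is on the allow-list."""
    addr = ipaddress.ip_address(destination)
    return any(addr in net for net in ALLOWED_NETWORKS)


def block_known_proxies(destination):
    """Option 3a: permit everything except addresses on a proxy block-list."""
    return ipaddress.ip_address(destination) not in KNOWN_PROXIES


for dest in ["192.0.2.10", "203.0.113.7", "198.51.100.5"]:
    print(dest, allow_only_listed(dest), block_known_proxies(dest))
```

The block-list only helps for proxies someone has already listed, which is why option 3b involves testing outgoing hosts and maintaining the list internally.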
control and scroll in outlook 2003
Anyone know why I can't control+scroll (with mouse wheel) to zoom in on a message in Outlook 2003? I can do this in everything but Outlook 2003. 3453451a (talk) 15:44, 9 April 2014 (UTC)
- It's a common feature in web browsers but not in other programs. Microsoft Outlook is not a browser. PrimeHunter (talk) 23:32, 9 April 2014 (UTC)
- I'm afraid the above is not correct. I can use Control+scroll to zoom in all my office 2003 programs, including excel, word and outlook, I have just confirmed. I have windows xp with office 2003 and can control+scroll messages, both when i open the message and even in the Outlook preview pane. As to why it is not working, that would require some troubleshooting. Can you confirm it works for you in MS Word? Vespine (talk) 23:54, 9 April 2014 (UTC)