Friday, May 28, 2010
Sinatra step 8
A) What is MongoDB? How does it relate to MongoHQ?
MongoDB bridges the gap between key-value stores (which are fast and highly scalable) and traditional RDBMS systems (which provide rich queries and deep functionality).
MongoDB (from "humongous") is a scalable, high-performance, open source, document-oriented database written in C++. Its features include document-oriented storage, full index support, replication and high availability, auto-sharding, querying, fast in-place updates, map/reduce, GridFS, and commercial support.
MongoHQ is a cloud-based hosting service for MongoDB that lets you quickly and easily create and interact with MongoDB instances.
Reference:
The Best Features of Document Databases, Key-Value Stores, and RDBMSes (2010), Retrieved on 23rd May 2010 from, http://www.mongodb.org/
B) What is MongoMapper?
MongoMapper is a Ruby wrapper library which aims to make using MongoDB much easier and friendlier than the default Ruby driver provided by 10gen. When it makes sense to do so, MongoMapper tries to stick closely to the familiar syntax of ActiveRecord.
Because of the way MongoDB stores data, there are two key concepts when working with MongoMapper: the Document, and the EmbeddedDocument.
The Document is essentially a record with fields, just as you’d expect.
An EmbeddedDocument is just like a Document, except that it is stored inside a parent Document while still retaining all of its own information.
Because of the way MongoMapper works, all of this is transparent to you. With MongoMapper, when you retrieve a record with an embedded document attached to it, the embedded document’s information is retrieved along with it, which is where MongoMapper truly shines. By embedding when it makes sense to do so, you are able to retrieve all of the typical join information in one speedy query.
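To make the embedding idea concrete, here is a minimal sketch in Python using plain dictionaries rather than MongoMapper or a live MongoDB server. The post/comment structure, the field names and the find_post helper are all hypothetical; the only point is that the embedded comments travel with their parent document in a single lookup.

```python
# A blog post stored as ONE document: the comments are embedded inside it,
# so no join against a separate "comments" table is needed.
post = {
    "_id": 1,
    "title": "Sinatra step 8",
    "comments": [  # embedded documents
        {"author": "alice", "body": "Nice write-up"},
        {"author": "bob", "body": "Thanks for sharing"},
    ],
}

# A tiny in-memory stand-in for a collection, keyed by _id (hypothetical).
posts = {post["_id"]: post}


def find_post(post_id):
    """Fetch a post; its embedded comments come back in the same lookup."""
    return posts[post_id]


found = find_post(1)
comment_authors = [comment["author"] for comment in found["comments"]]
print(comment_authors)
```

In a relational design the same page would typically need a second query, or a join, to collect the comments.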
References:
Nunemaker, J. (January 1st, 2010), MongoMapper, Retrieved on 23rd May, 2010 from,
http://wiki.github.com/jnunemaker/mongomapper/home/7
C) What is the relation between MongoDB and MySQL?
“Interesting about MongoDB is its approach to Durability, Data Consistency and Availability. It is very relaxed and will not work for some applications but for others it can be usable in current form. Let me explain some concepts and compare it to technologies in MySQL space.
First MongoDB is best compared no to MySQL Server but MySQL Cluster, especially in newer versions which implement "sharding". Same as commit to NDB Storage engine does not normally mean commit to disk, but rather commit to network it does not mean commit to disk with MongoDB, furthermore MongoDB uses Asynchronous replication, meaning it may take some time before data will be at more than one node. You can also use getLastError() to ensure data is propagated to the slave. So you can see it as a hybrid between MySQL Cluster and innodb_flush_log_at_trx_commit=2 mode. The second difference of course the fact MongoDB is not crash safe - similar to MyISAM database will need to be repaired if it crashes. Still I find behavior somewhat similar - you're not expected to run MySQL Cluster without replication, MongoDB is practically the same.
Second - if we look at Replication Sets we find them very similar to MySQL Cluster though designed to work with Wide area network and so Async replication. There is voting required to pick the master node in case of node failure and at least 3 servers is recommended, where you can have some voting servers only cast their votes and hold no data. The other different is there is only one master rather than multiple. This is because doing master with asynchronous replication requires conflict resolution which can be tricky in general sense and MongoDB wants simplicity of operation for developers and administration.
Third if we look at how failover happens - same with NDB (native API) it is handled on driver level. When you connect to replication set you connect to set of server not one of them and if one server fails driver fails over to different master. Things are again tuned to deal with Asynchronous Replication. Consistency is maintained but at expense of certain changes may be thrown away/ "rolled back" in case of fail over.” (Peter, 2010)
Reference:
Peter (1st May, 2010), Mongo DB Approach to Availability, Retrieved on 24th May 2010, from http://swik.net/mongodb+MySQL
exercise 9
1. Find out about SET and the use of RSA 128-bit encryption for e-commerce.
Answer:
SET:
Secure Electronic Transaction (SET) is a standard designed to enable secure credit card transactions on the Internet. It provides the basic framework within which the various components for securing digital transactions operate. SET has been endorsed by virtually all the major players in the electronic commerce arena, including Microsoft, Netscape, VISA and MasterCard. By using digital signatures, SET enables merchants to verify that buyers are who they claim to be. It also protects buyers by providing a mechanism for their credit card number to be transferred directly to the credit card issuer for verification and billing without the merchant being able to see the number.
RSA 128-bit encryption:
RSA is the public-key algorithm most commonly used in electronic commerce protocols, and it is believed to be secure given sufficiently long keys and up-to-date implementations. In the phrase "128-bit encryption", the 128 bits normally refer to the symmetric session key that protects the data, while RSA is used to exchange that key and to authenticate the server; RSA keys themselves need to be much longer (1024 bits or more) to be considered secure. A 128-bit session key is widely regarded as infeasible to break by brute force. This combination is commonly used between the browser and the server to secure the transmission.
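The mechanics of RSA can be shown with a toy example. This is a minimal sketch with deliberately tiny textbook primes (the values of p, q and e are chosen for illustration only); it has no padding and must not be used for real security, where keys are far longer and a vetted library does the work.

```python
# Textbook RSA with tiny primes, for illustration only.
p, q = 61, 53
n = p * q                 # public modulus
phi = (p - 1) * (q - 1)   # Euler's totient of n
e = 17                    # public exponent, coprime with phi
d = pow(e, -1, phi)       # private exponent: modular inverse (Python 3.8+)


def encrypt(message):
    """Encrypt an integer smaller than n with the public key (e, n)."""
    return pow(message, e, n)


def decrypt(ciphertext):
    """Decrypt with the private key (d, n)."""
    return pow(ciphertext, d, n)


message = 65
ciphertext = encrypt(message)
recovered = decrypt(ciphertext)
print(n.bit_length(), recovered == message)
```

The modulus here is only 12 bits long, so it can be factored by hand; the security of RSA rests entirely on n being too large to factor.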
References:
Secure Electronic Transaction (2010), Retrieved on May 20, 2010 from http://e-comm.webopedia.com/TERM/S/Secure_Electronic_Transaction.html
Management Online. (Mar 8, 2005) BusinessObjects XI increases security by providing RSA 128-bit encryption as standard platform security level. Retrieved on May 20, 2010 from http://www.information-management.com/news/1022600-1.html
2. What can you find out about network and host-based intrusion detection systems?
Network Intrusion Detection Systems (NIDS)
NIDS monitors packets on the network wire and attempts to discover whether a hacker/cracker is attempting to break into a system (or cause a denial of service attack). A typical example is a system that watches for a large number of TCP connection requests (SYN) to many different ports on a target machine, thus discovering whether someone is attempting a TCP port scan. A NIDS may run either on the target machine, which watches its own traffic (usually integrated with the stack and services themselves), or on an independent machine promiscuously watching all network traffic (hub, router, probe). Note that a “network” IDS monitors many machines, whereas the others monitor only a single machine (the one they are installed on) (Graham, R., 2000).
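The port scan example can be sketched in a few lines of Python. This is a simplified illustration, not how any particular NIDS product works: the packets are simulated tuples rather than captured traffic, and the threshold of 10 distinct ports is an arbitrary assumption.

```python
from collections import defaultdict

# Hypothetical threshold: flag a source that probes this many distinct ports.
PORT_SCAN_THRESHOLD = 10


def detect_port_scans(syn_packets, threshold=PORT_SCAN_THRESHOLD):
    """Return the (source, target) pairs that look like a port scan.

    syn_packets is an iterable of (source_ip, target_ip, target_port)
    tuples, one per observed TCP SYN.
    """
    ports_seen = defaultdict(set)
    for source, target, port in syn_packets:
        ports_seen[(source, target)].add(port)
    return {pair for pair, ports in ports_seen.items() if len(ports) >= threshold}


# Simulated traffic: one scanner sweeping ports 1-20, one normal web client.
traffic = [("10.0.0.9", "10.0.0.1", port) for port in range(1, 21)]
traffic += [("10.0.0.5", "10.0.0.1", 80)] * 30

suspects = detect_port_scans(traffic)
print(suspects)
```

Counting distinct ports rather than raw SYNs keeps a busy but legitimate client, which hits the same port many times, from being flagged.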
Host-based intrusion detection system:
Different from NIDS, host-based IDS monitors all or parts of the dynamic behaviour and the state of a computer system (Wikipedia). Much as a NIDS will dynamically inspect network packets, a HIDS might detect which program accesses what resources and discover that, for example, a word-processor has suddenly and inexplicably started modifying the system password database. Similarly a HIDS might look at the state of a system, its stored information, whether in RAM, in the file system, log files or elsewhere; and check that the contents of these appear as expected. One can think of a HIDS as an agent that monitors whether anything or anyone, whether internal or external, has circumvented the system’s security policy.
References:
Graham, R. (2000). FAQ: Network Intrusion Detection System. Retrieved on May 20, 2010, from http://www.linuxsecurity.com/resource_files/intrusion_detection/network-intrusion-detection.html
Wikipedia, Host-based Intrusion Detection System. Retrieved on May 20, 2010, from http://en.wikipedia.org/wiki/Host_based_intrusion_detection_system
3. What is 'phishing'?
Phishing (pronounced "fishing") is a type of online identity theft. It uses e-mail and fraudulent Web sites that are designed to steal your personal data or information such as credit card numbers, passwords, account data, or other information.
Con artists might send millions of fraudulent e-mail messages with links to fraudulent Web sites that appear to come from Web sites you trust, like your bank or credit card company, and request that you provide personal information. Criminals can use this information for many different types of fraud, such as to steal money from your account, to open new accounts in your name, or to obtain official documents using your identity.
Reference:
Microsoft (2010), Phishing - general, Retrieved on May 20, 2010, from http://www.microsoft.com/protect/yourself/phishing/faq.mspx
4. What is SET and how does it compare to SSL as a platform for secure electronic transaction? Is SET in common use?
Answer:
Secure Electronic Transaction (SET) is a standard designed to enable secure transactions on the Internet. It provides the basic framework within which the various components for securing digital transactions operate.
SET is the more secure protocol, but that extra security makes it more complex and more costly. Users must obtain a digital wallet or certificate from their bank and remember its password every time they buy. SSL, by contrast, is much easier to put into practice and is readily accepted by online customers.
Even though SET has the backing of two major credit card companies, VISA and MasterCard, SSL is built into all major browsers and web servers, so simply installing a digital certificate turns on their SSL capabilities. This makes SSL easier for a business to adopt at the outset. These are the sorts of market advantages that can develop when a protocol like SSL has been invented by and has the support of major computer organizations such as Microsoft and Netscape rather than conventional credit-extending companies such as VISA and MasterCard.
In conclusion, SSL is very easy to use and widely accepted, and upcoming protocols may offer more protection. Secure electronic transactions will be an important part of electronic commerce in the future, but the cost and complexity of SET remain obstacles to its common use.
References:
Johnny Papa (April, 2010), Secure Sockets Layer: Protect Your E-Commerce Web Site with SSL and Digital Certificates, Retrieved on May 20, 2010, from http://msdn.microsoft.com/en-us/magazine/cc301946.aspx
Ganesh Ramakrishnan (2000), Secure Electronic Transaction (SET) Protocol, Retrieved on May 20, 2010, from http://www.isaca.org/Template.cfm?Section=Home&CONTENTID=21545&TEMPLATE=/ContentManagement/ContentDisplay.cfm
5. What are cookies and how are they used to improve security? Can the use of cookies be a security risk?
Answer:
A cookie is a small file stored on the user's hard drive by the browser that allows a website to distinguish that user from others. Cookies are mostly used by electronic commerce websites, where they store the user's preferences so that the user does not have to select them again on the next visit.
The major problem with cookies is the information they contain. When a user connects to a website that can be personalized, the user is prompted with several questions in order to build a profile, and this information is then stored in a cookie. Depending on the website, the manner in which this data is stored could end up harming the user.
In fact, an online sales site could collect information on users' preferences by means of a questionnaire, so that they can suggest items that would be of interest to users.
A cookie is a way to create a link between the user's session (browsing certain pages of a website for a certain amount of time) and the data relating to the user.
Ideally, a cookie should contain a random string (the session identifier), which is unique, difficult to guess, and valid only for a given period of time. Only the server should be able to associate the user's preferences with the session identifier. Thus, when the session cookie expires, it becomes useless, and it should not contain any information relating to the user.
The cookie should never contain direct user information, and its lifespan should be as close as possible to the duration of the user's session.
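A minimal sketch of this design in Python is shown below. It is an in-memory illustration under assumed names (create_session, load_session and the 30-minute lifetime are all hypothetical), not the API of any web framework: the cookie carries only a random identifier, while the user's data stays on the server.

```python
import secrets
import time

SESSION_LIFETIME_SECONDS = 30 * 60  # hypothetical 30-minute session

# Server-side store: session id -> (expiry timestamp, user data).
sessions = {}


def create_session(user_data, now=None):
    """Return an opaque random id for the cookie; the data stays server-side."""
    now = time.time() if now is None else now
    session_id = secrets.token_urlsafe(32)
    sessions[session_id] = (now + SESSION_LIFETIME_SECONDS, user_data)
    return session_id


def load_session(session_id, now=None):
    """Return the user data for a valid session, or None if unknown or expired."""
    now = time.time() if now is None else now
    entry = sessions.get(session_id)
    if entry is None:
        return None
    expires_at, user_data = entry
    if now >= expires_at:
        del sessions[session_id]  # an expired id becomes useless
        return None
    return user_data


cookie_value = create_session({"preferred_language": "en"}, now=1000.0)
print(load_session(cookie_value, now=1001.0))
```

Because the identifier is random and meaningless on its own, someone who reads the cookie learns nothing about the user, and the server can invalidate it at any time.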
On the other hand, the data stored in a cookie is set by the server, based on what the user entered, and is sent back to that server on later requests. Thus, the cookie should never contain information that the user has not supplied, nor information about the contents of the computer; in other words, the cookie should not collect information from the user's computer.
So, always refuse to give personal information to a website that does not seem legitimate, as it has no right to collect your personal information.
A cookie is not a dangerous file in itself if it is well designed and if the user does not provide personal data.
References:
Security – Cookies (October 16, 2008), Retrieved on May 20, 2010, from http://en.kioskea.net/contents/securite/cookies.php3
6. What makes a firewall a good security investment? Accessing the Internet, find two or three firewall vendors. Do they provide hardware, software or both?
Answer:
A firewall is a system designed to prevent unauthorized access to or from a private network. Firewalls can be implemented in hardware, in software, or in a combination of both. They are frequently used to prevent unauthorized Internet users from accessing private networks connected to the Internet, especially intranets. All messages entering or leaving the intranet pass through the firewall, which examines each message and blocks those that do not meet the specified security criteria.
There are several types of firewall techniques, according to Webopedia (2009):
Packet filter: Looks at each packet entering or leaving the network and accepts or rejects it based on user-defined rules. Packet filtering is fairly effective and transparent to users, but it is difficult to configure. In addition, it is susceptible to IP spoofing.
Application gateway: Applies security mechanisms to specific applications, such as FTP and Telnet servers. This is very effective, but can degrade performance.
Circuit-level gateway: Applies security mechanisms when a TCP or UDP connection is established. Once the connection has been made, packets can flow between the hosts without further checking.
Proxy server: Intercepts all messages entering and leaving the network. The proxy server effectively hides the true network addresses.
In practice, many firewalls use two or more of these techniques in concert, and usually, a firewall is considered a first line of defense in protecting private information.
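Of the techniques listed above, the packet filter is the easiest to sketch in code. The following Python example is a simplified illustration, assuming a hypothetical rule format of (action, source network, destination port) with first-match-wins semantics and a default of deny; real firewalls inspect many more packet fields.

```python
from ipaddress import ip_address, ip_network

# Hypothetical rule set: (action, source network, destination port).
# A port of None matches any port. The first matching rule wins.
RULES = [
    ("deny", ip_network("203.0.113.0/24"), None),  # block a whole network
    ("allow", ip_network("0.0.0.0/0"), 80),        # web traffic from anywhere
    ("allow", ip_network("0.0.0.0/0"), 443),
]
DEFAULT_ACTION = "deny"  # anything not explicitly allowed is rejected


def filter_packet(source_ip, dest_port):
    """Return 'allow' or 'deny' for a packet, using first-match-wins rules."""
    source = ip_address(source_ip)
    for action, network, port in RULES:
        if source in network and (port is None or port == dest_port):
            return action
    return DEFAULT_ACTION


print(filter_packet("198.51.100.7", 80))   # ordinary web request
print(filter_packet("203.0.113.5", 80))    # source is on the blocked network
print(filter_packet("198.51.100.7", 23))   # Telnet is not in the allow list
```

The filter trusts the source address written in the packet, which is exactly why packet filtering is susceptible to IP spoofing.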
The following firewall vendors can be found on the Internet:
i) Cisco
- Provides both hardware and software in multiple integrated solutions. Cisco's firewall products are based on modular, scalable platforms, and each firewall is designed to secure a different network environment.
ii) IBM
- Unlike Cisco, IBM does not focus on firewall hardware; firewalls are one part of its broad range of security products and services, from software to hardware to consulting. It also provides solutions for customers using other suppliers' firewall hardware.
iii) NEC
- As a worldwide network equipment vendor, NEC provides firewall solutions in both hardware and software. It specializes in security software bundled with firewalls and in working with other brands' firewall products.
References:
Firewall (2010), Retrieved on May 20, 2010, from http://www.webopedia.com/TERM/F/firewall.html
7. What measures should e-commerce provide to create trust among their potential customers? What measures can be verified by the customer?
Answer:
In the e-commerce business, securing trust in a company is essential to its success. Trust is as important to a potential customer’s purchasing decision as the products the company offers him. And an essential element of building that trust, with both customers and partners, is the assurance that the e-commerce operation meets the demanding security standards required of organizations handling sensitive financial information.
i) Authentication
Customers must be able to assure themselves that they are in fact doing business with, and sending private information to, a real entity and not a “spoof” site masquerading as a legitimate bank or e-store.
ii) Confidentiality
Sensitive Internet communications and transactions, such as the transmission of credit card information, must be kept private.
iii) Data integrity
Communications must be protected from undetectable alteration by third parties during transmission over the Internet.
iv) Nonrepudiation
It should not be possible for a sender to reasonably claim that he or she did not send a secured communication or did not make an online purchase.
Digital certificates, email confirmations, and online enquiries can help customers verify that security measures are in place in an e-commerce environment.
References:
Building an E-Commerce Trust Infrastructure (n.d), Retrieved on May 20, 2010, from
8. Get the latest PGP information from http://en.wikipedia.org/wiki/Pretty_Good_Privacy.
The use of digital certificates and passports are just two examples of many tools for validating legitimate users and avoiding consequences such as identity theft. What others exist?
Answer:
The latest PGP news is that PGP Corporation and Protegrity have partnered to provide continuous end-to-end security of sensitive data. Their combined approach rests on three pillars of protection for sensitive data: (1) end-to-end encryption for any kind of sensitive data in any location; (2) automated key management; and (3) centralized administration and reporting to address compliance. According to the press release, integrating PGP Corporation's data protection suite with Protegrity's offerings for customers that require database encryption, tokenization, format-controlled encryption, and masking delivers an end-to-end data protection solution.
Besides digital certificates and passports, identity theft can be avoided by enrolling in identity insurance or an identity guard service, or by registering or making enquiries with an Identity Theft Knowledge Centre.
References:
PGP Corporation and Protegrity Partner to Provide Continuous End-to-End Security of Sensitive Data (19th May, 2010), Retrieved on May 20, 2010, from http://www.pgp.com/insight/newsroom/press_releases/pgp_protegrity.html
exercise 10
1) Find definitions for eight terms and concepts used in threaded programming:
Thread Synchronisation
Thread synchronisation means coordinating the execution of threads and synchronizing their access to shared resources. It is especially useful for avoiding conflicts when more than one thread needs to access a single variable or other resource.
Locks
The Java programming language provides multiple mechanisms for communicating between threads. The most basic of these methods is synchronization, which is implemented using monitors. Each object in Java is associated with a monitor, which a thread can lock or unlock. Only one thread at a time may hold a lock on a monitor. Any other threads attempting to lock that monitor are blocked until they can obtain a lock on that monitor. A thread t may lock a particular monitor multiple times; each unlock reverses the effect of one lock operation. (Sun, 2005)
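Python's threading.RLock behaves in a similar way to the Java monitor described above: the thread that holds it may acquire it again, and each release undoes one acquire. The sketch below is an illustration in Python rather than Java, with a hypothetical shared balance as the protected resource.

```python
import threading

lock = threading.RLock()  # re-entrant: the owning thread may lock it again
balance = 0


def deposit(amount):
    global balance
    with lock:  # a nested acquisition by the same thread succeeds
        balance += amount


def deposit_twice(amount):
    with lock:  # first acquisition
        deposit(amount)
        deposit(amount)


threads = [threading.Thread(target=deposit_twice, args=(5,)) for _ in range(10)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

print(balance)
```

With a plain threading.Lock in place of the RLock, deposit_twice would block forever on the nested acquisition, because a non-re-entrant lock cannot be taken twice by the same thread.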
Deadlock
For application programming, as opposed to server implementation, thread pools pose some concurrency risks, because the tasks making up an application tend to depend on each other. In particular, deadlock is a significant concern. A deadlock occurs when a set of threads forms a cycle of waiting, each holding a resource that the next one needs; in that case the threads are held up and none of them can proceed.
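One common way to prevent such a waiting cycle is to make every thread acquire its locks in the same global order. The Python sketch below illustrates that technique with a hypothetical Account class and transfer function; it is one possible remedy, not the only one.

```python
import threading


class Account:
    def __init__(self, ident, balance):
        self.ident = ident
        self.balance = balance
        self.lock = threading.Lock()


def transfer(source, destination, amount):
    # Always lock the account with the smaller ident first, so two opposite
    # transfers can never each hold one lock while waiting for the other.
    first, second = sorted((source, destination), key=lambda account: account.ident)
    with first.lock:
        with second.lock:
            source.balance -= amount
            destination.balance += amount


a = Account(1, 100)
b = Account(2, 100)

workers = [threading.Thread(target=transfer, args=(a, b, 1)) for _ in range(50)]
workers += [threading.Thread(target=transfer, args=(b, a, 1)) for _ in range(50)]
for worker in workers:
    worker.start()
for worker in workers:
    worker.join()

print(a.balance, b.balance)
```

If each transfer instead locked its source account first, a transfer from a to b and a simultaneous transfer from b to a could each take one lock and wait forever for the other.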
Semaphores
According to (Sun, 2009), Semaphores are a programming construct designed by E. W. Dijkstra in the late 1960s. Dijkstra’s model was the operation of railroads: consider a stretch of railroad in which there is a single track over which only one train at a time is allowed. Guarding this track is a semaphore. A train must wait before entering the single track until the semaphore is in a state that permits travel. When the train enters the track, the semaphore changes state to prevent other trains from entering the track. A train that is leaving this section of track must again change the state of the semaphore to allow another train to enter.
In the computer version, a semaphore appears to be a simple integer. A thread waits for permission to proceed and then signals that it has proceeded by performing a P operation on the semaphore.
Mutex (mutual exclusion)
Mutex is an abbreviation for “mutual exclusion”. Mutex variables are one of the primary means of implementing thread synchronization and of protecting shared data when multiple writes occur. A mutex variable acts like a “lock” protecting access to a shared data resource. The basic concept of a mutex as used in Pthreads is that only one thread can lock (or own) a mutex variable at any given time. Thus, even if several threads try to lock a mutex, only one thread will be successful. No other thread can own that mutex until the owning thread unlocks it. Threads must “take turns” accessing protected data (Barney, 2009).
Thread
A thread is simply an agent spawned by the application to perform work independently of the parent process. While the terms thread and threading have sometimes referred to spawning (forking) multiple processes, more frequently they refer specifically to a pthread, a worker which is spawned by the parent process and which shares that parent's resources. Threads fundamentally differ from processes in that they are “lightweight” and share memory.
Event
According to Wikipedia, an event is an action that is usually initiated outside the scope of a program and that is handled by a piece of code inside the program. Typically events are handled synchronously with the program flow; that is, the program has one or more dedicated places where events are handled. Typical sources of events include the user (who presses a key on the keyboard) and hardware devices such as a timer. A computer program that changes its behavior in response to events is said to be event-driven, often with the goal of being interactive.
Waitable timer
A waitable timer object is a synchronization object whose state is set to signaled when the specified due time arrives (MSDN, 2009).
References:
Sun Microsystems, (2005). Threads and Locks. Retrieved on May 21st, 2010 from, http://java.sun.com/docs/books/jls/third_edition/html/memory.html
Barney, B., (2009). POSIX Threads Programming. Retrieved on May 21st, 2010 from, https://computing.llnl.gov/tutorials/pthreads/
Sun Microsystems, (2009). Semaphores. Retrieved on May 21st, 2009 from, http://docs.sun.com/app/docs/doc/805-5080/6j4q7emgq?a=view
Wikipedia, Event. Retrieved on May 21st, 2010 from, http://en.wikipedia.org/wiki/Event_(computing)
MSDN, (2009). Waitable Timer Objects. Retrieved on May 21st, 2010 from, http://msdn.microsoft.com/en-us/library/ms687012(VS.85).aspx
2) A simple demonstration of the threading module in Python that uses both a lock and a semaphore to control concurrency is by Ted Herman at the University of Iowa. The code and sample output below are worth a look. Report your findings.
The Python code by Ted Herman is simple and a very good illustration of locks and semaphores.
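Code for the lock:
import threading
mutex = threading.RLock()  # re-entrant lock guarding the shared counter
mutex.acquire()
# ... update the shared "running" counter here ...
mutex.release()
Code for the semaphore:
sema = threading.BoundedSemaphore(value=3)  # at most three holders at once
sema.acquire()
# ... work done by up to three threads at the same time ...
sema.release()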
Right after the lock statement mutex.acquire(), the running counter is either incremented or decremented, and the lock must then be released with mutex.release(). In the semaphore case, however, several threads can be between the acquire and release statements at the same time.
The semaphore here acts as a counting semaphore, which is a counter for a set of available resources rather than a locked/unlocked flag for a single resource.
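The fragments can be combined into a small runnable program. This is a separate sketch written for illustration, not Ted Herman's original code: the worker function, the sleep used as simulated work and the max_running counter are assumptions added to make the effect of the semaphore observable.

```python
import threading
import time

sema = threading.BoundedSemaphore(value=3)  # three slots available
mutex = threading.RLock()                   # protects the counters below
running = 0
max_running = 0


def worker():
    global running, max_running
    with sema:  # blocks while three workers already hold a slot
        with mutex:
            running += 1
            max_running = max(max_running, running)
        time.sleep(0.01)  # simulated work
        with mutex:
            running -= 1


threads = [threading.Thread(target=worker) for _ in range(10)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

print(max_running)
```

Ten threads are started, but the recorded maximum never exceeds three, because the semaphore admits only three holders at a time while the lock keeps each counter update exclusive.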
exercise 11
1. Give a description in your own words of the ACID properties of a transaction.
Atomicity ensures that transactions on a database follow an “all or nothing” rule: if any part of a transaction fails, the whole transaction is rolled back.
Consistency means that a transaction takes the database from one consistent state to another, so the data satisfies all its rules both before the transaction starts and after it finishes.
Isolation means that other operations or processes cannot see the intermediate state of the data during a transaction; its effects become visible only once the transaction is finished.
Durability means that once a transaction is committed, its changes are guaranteed to persist even through system failures. A transaction that has not yet committed is not durable and can still be rolled back.
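Atomicity and consistency can be demonstrated with Python's built-in sqlite3 module. The sketch below uses a hypothetical accounts table with a CHECK constraint that forbids negative balances; the transfer function is an illustration, not a production design.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"  # consistency rule
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()


def transfer(conn, source, destination, amount):
    """Move money in one transaction: both updates commit or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, destination),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, source),
            )
        return True
    except sqlite3.IntegrityError:
        return False


succeeded = transfer(conn, "alice", "bob", 30)   # allowed
rejected = transfer(conn, "alice", "bob", 500)   # would overdraw alice
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
print(succeeded, rejected, balances)
```

In the second transfer the credit to bob is applied first and then undone when the debit violates the constraint, which is the "all or nothing" behaviour in action.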
2. Describe a TP monitor environment. How can a TP monitor stop an operating system being overwhelmed?
A TP monitor (TeleProcessing monitor or Transaction Processing monitor) is a control program that manages the transfer of data between multiple local and remote terminals and the application programs that serve them. It may also include programs that format the terminal screens and validate the data entered.
In a distributed client/server environment, a TP monitor provides integrity by ensuring that transactions do not get lost or damaged. It may be placed in a separate machine and used to balance the load between clients and various application servers and database servers. It is also used to create a high availability system by switching a failed transaction to another machine. A TP monitor guarantees that all databases are updated from a single transaction. (PC Magazine, 2009)
A TP monitor ensures that if the number of incoming client requests exceeds the number of processes in a server class, new processes are started dynamically; this is called load balancing. By distributing processing and communications activity across the computer network, it ensures that no single device, and hence no single operating system, is overwhelmed.
References:
PC Magazine, (2009). TP Monitor. Retrieved on May 22nd, 2010 from
http://www.pcmag.com/encyclopedia_term/0,2542,t=TP+monitor&i=53022,00.asp
Wikipedia, LoadBalancing., Retrieved May 22nd, 2010 from,
exercise 12
exercise 13
exercise 14
A program that reads the pages of a website and discovers other pages by following the hypertext links on them is called a spider, also referred to as a web crawler because it crawls through web pages. Search engines such as AltaVista use web crawling to gather the pages from which they show results. A crawler is a type of software agent that starts with a list of URLs (the seeds) and follows the hyperlinks on those pages to reach other pages. The links it finds are added to its list of URLs to visit, called the crawl frontier. URLs from this frontier are visited recursively according to a set of policies.
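The crawl loop described above can be sketched in Python. To keep the example self-contained, the web is simulated by a hypothetical dictionary (FAKE_WEB) mapping each URL to the links on that page; a real crawler would download and parse the pages instead, and would apply further policies such as politeness delays.

```python
from collections import deque

# A hypothetical four-page site: URL -> list of URLs linked from that page.
FAKE_WEB = {
    "http://example.com/": ["http://example.com/a", "http://example.com/b"],
    "http://example.com/a": ["http://example.com/b", "http://example.com/c"],
    "http://example.com/b": ["http://example.com/"],
    "http://example.com/c": [],
}


def crawl(seeds, fetch_links, max_pages=100):
    """Visit pages breadth-first, starting from the seed URLs."""
    frontier = deque(seeds)  # URLs waiting to be visited
    seen = set(seeds)
    visited = []
    while frontier and len(visited) < max_pages:
        url = frontier.popleft()
        visited.append(url)
        for link in fetch_links(url):
            if link not in seen:  # policy: never queue the same URL twice
                seen.add(link)
                frontier.append(link)
    return visited


order = crawl(["http://example.com/"], lambda url: FAKE_WEB.get(url, []))
print(order)
```

The seen set is what stops the crawler from looping forever when pages link back to each other, as the first and third pages do here.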
Reference :
Web crawler. Wikipedia. 2010. Wikimedia Foundation, Inc., from http://en.wikipedia.org/wiki/
2. Differentiate the various types of software agents.
A software agent is software that acts on behalf of a user or another program in a relationship of agency, with the authority to decide which action is appropriate.
The various types of software agents are buyer agents, user agents, monitoring and surveillance agents, and data mining agents.
i. Buyer agent
A buyer agent, also known as a shopping bot, finds information about goods and services on the Internet. It works very efficiently for commodity products such as books and electrical components.
ii. User agent
A user agent, also known as a personal agent, takes actions on your behalf. Some of the actions it can perform are as follows:
· It can find out information on different subjects.
· It can check emails, sort them in order of preference and give alerts on receiving important emails.
· It can automatically fill forms on the internet and store information for future reference.
· It can scan web pages and highlight text which contains the important part.
· It can play as opponent in computer games or can also patrol game areas.
· It can bring together customized news reports. Newshub and CNN are amongst several versions of these.
iii. Monitoring and surveillance Agent:
A monitoring and surveillance agent is used to observe and report on equipment, mostly computer systems. Such an agent can perform actions like:
· keeping records of the company's inventory levels,
· watching competitors' prices and relaying them to the company,
· observing stock changes caused by insider trading and rumors.
iv. Data Mining Agent:
A data mining agent uses information technology to find patterns and trends in large amounts of information from many different sources. The user can then sort through this information to find what is required.
Reference :
Software Agent. Wikipedia. 2010. Wikimedia Foundation, Inc., from http://en.wikipedia.org/wiki/
3. Identify various activities in e-commerce where software agents are currently in use.
The various activities in e-commerce where software agents are used are as follows:
i. Buyer Agent:
A good example of a shopping bot is Amazon.com or eBay.com. Here, on the basis of goods and products that we are buying now or have bought earlier, the website will give a list of products that we might like to buy. The list offers items according to our taste and choice based on our searches or products we have previously bought.
ii. User Agent:
Yahoo.com, Google.com and msn.com are examples of sites that offer user agents. Mail can be customized according to the user's preferences, and such agents can also play games, assemble news, and so on.
iii. Monitoring and Surveillance Agent :
For example, NASA's Jet Propulsion Laboratory has an agent that monitors inventory, planning, and the scheduling of equipment ordering to keep costs down, as well as food storage facilities. Monitoring agents of this kind can also keep track of the configuration of each computer connected to a network.
iv. Data Mining Agent:
An example of data mining agent can be an agent developed by a corporation to analyse economic trends. It will alert the management in case of any changes like the consumers becoming more conservative. Hence, with this information, the management can make proper plans and take decisions on how to produce, market and sell its products. This will make all the processes from production to sale very efficient.
Reference:
Data Mining Agent. Wikipedia. 2010. Wikimedia Foundation, Inc., from http://en.wikipedia.org/wiki/
Software Agent. Wikipedia. 2010. Wikimedia Foundation, Inc., from http://en.wikipedia.org/wiki/