Advantages and Limitations of CBC
• A ciphertext block depends on all blocks before it, so any change to a plaintext block affects all following ciphertext blocks.
• Initialization vector (IV): a different IV hides patterns and repetitions.
• Error propagation: one error during encryption (rare) affects all subsequent ciphertext blocks; one error during transmission affects exactly two blocks, the current one and the next one (as the sketch below demonstrates).
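To see the transmission-error behavior concretely, here is a minimal sketch assuming the third-party Python `cryptography` package (any AES-CBC implementation would do): flipping a single ciphertext bit garbles one plaintext block completely and flips exactly one bit in the next.

```python
# CBC error propagation on the receiving side: a one-bit transmission error in
# ciphertext block 1 garbles plaintext block 1 entirely and flips exactly one
# bit in plaintext block 2; all other blocks decrypt cleanly.
# Minimal sketch assuming the third-party `cryptography` package.
import os
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

key, iv = os.urandom(16), os.urandom(16)
plaintext = bytes(64)  # four all-zero 16-byte blocks, so damage is easy to spot

enc = Cipher(algorithms.AES(key), modes.CBC(iv)).encryptor()
ciphertext = enc.update(plaintext) + enc.finalize()

damaged = bytearray(ciphertext)
damaged[16] ^= 0x01  # one-bit error in ciphertext block 1

dec = Cipher(algorithms.AES(key), modes.CBC(iv)).decryptor()
recovered = dec.update(bytes(damaged)) + dec.finalize()

for i in range(4):
    block = recovered[16 * i : 16 * (i + 1)]
    print(f"block {i}:", "intact" if block == bytes(16) else "corrupted")
# Expected output: blocks 0 and 3 intact, blocks 1 and 2 corrupted.
```

During encryption the dependency runs the other way: because each ciphertext block is fed into the next encryption, an error introduced while encrypting block i corrupts every ciphertext block from i onward.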

Blocks on Blocks on Blocks! Dynamically Rearranging Synteny Blocks in Comparative Genomes. Nick Egan's final project presentation for BIO 131: Intro to Computational Biology, taught by Anna Ritz.

EVLA Data Processing PDR (Boyd Waters, July 18-19, 2002). [Diagram: observing blocks, each bracketed by a preamble and "post-amble," move from an input queue through execution, with ready/ok/failed states, producing measurement sets that are written to the archive.]

Transparent Scalability: Hardware is free to assign blocks to any SM (processor), so a kernel scales across any number of parallel processors. Each block can execute in any order relative to other blocks. [Figure: the same eight-block kernel grid scheduled two blocks at a time on one device and four at a time on a larger one.]
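CUDA guarantees nothing about the relative order of blocks, and a correct kernel must not depend on one. As a CPU-side analogy (plain Python, no GPU required; the kernel out[i] = 2 * in[i] and all names are made up for illustration), the sketch below runs the same per-block work under several block schedules and checks that the result never changes:

```python
# CPU-side analogy for CUDA's transparent scalability: each "block" does
# independent work on its own slice of the output, so any execution order
# (in order, reversed, shuffled) yields the same result.
import random

BLOCK_DIM, GRID_DIM = 4, 8
inp = list(range(BLOCK_DIM * GRID_DIM))

def run_block(block_idx, out):
    # What one thread block would do: each "thread" handles one element.
    for thread_idx in range(BLOCK_DIM):
        i = block_idx * BLOCK_DIM + thread_idx
        out[i] = inp[i] * 2

def launch(block_order):
    out = [0] * len(inp)
    for b in block_order:  # hardware may schedule blocks in any order
        run_block(b, out)
    return out

orders = [list(range(GRID_DIM)),                     # in order
          list(reversed(range(GRID_DIM))),           # reversed
          random.sample(range(GRID_DIM), GRID_DIM)]  # arbitrary
results = [launch(o) for o in orders]
assert all(r == results[0] for r in results)
print("same result for every block schedule")
```

This independence between blocks is what lets the hardware scale a kernel transparently: a device with more SMs simply runs more blocks concurrently.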

Transparent Scalability Hardware is free to assign blocks to any processor at any time  A kernel scales across any number of parallel processors Device Device Kernel grid Block 0 Block 1 Block 2 Block 3 Block 0 Block 2 Block 1 Block 3 Block 4 Block 5 Block 6 Block 7  Block 4 Block 5 Block 6 Block 7 time Block 0 Block 1 Block 2 Block 3 Block 4 Block 5 Block 6 Block 7 Each block can execute in any CUDA Tools and Threads – Slide order relative 69

Block Ciphers (Chapter 8, Cryptographic Building Blocks): A common mode of operation is cipher block chaining (CBC), in which each plaintext block is XORed with the previous block's ciphertext before being encrypted. The result is that each block's ciphertext depends in part on the preceding blocks, i.e., on its context. Since the first plaintext block has no preceding block, it is XORed with a random number. That random number, called an initialization vector (IV), is included with the series of ciphertext blocks so that the first ciphertext block can be decrypted.
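To make the chaining concrete, here is a pure-Python sketch in which the one-block "cipher" is a toy byte substitution, chosen only so the example is self-contained and runnable; it is not secure and is not any real cipher.

```python
# Manual CBC chaining around a toy, invertible one-block "cipher"
# (a fixed byte substitution). Illustrative only; NOT secure.
import os
import random

BLOCK = 4
_sbox = list(range(256))
random.Random(0).shuffle(_sbox)   # fixed permutation of byte values
_inv = [0] * 256
for i, s in enumerate(_sbox):
    _inv[s] = i

def enc_block(b):                 # stand-in for a real block cipher
    return bytes(_sbox[x] for x in b)

def dec_block(b):                 # its inverse
    return bytes(_inv[x] for x in b)

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def cbc_encrypt(blocks, iv):
    prev, out = iv, []
    for m in blocks:
        prev = enc_block(xor(m, prev))  # XOR with previous ciphertext, then encrypt
        out.append(prev)
    return out

def cbc_decrypt(blocks, iv):
    prev, out = iv, []
    for c in blocks:
        out.append(xor(dec_block(c), prev))
        prev = c                        # the next block chains off this ciphertext
    return out

iv = os.urandom(BLOCK)                  # shipped alongside the ciphertext
msg = [b"spam", b"spam", b"spam"]       # identical plaintext blocks
ct = cbc_encrypt(msg, iv)
print([c.hex() for c in ct])            # almost surely three distinct blocks
assert cbc_decrypt(ct, iv) == msg
```

Because each ciphertext block feeds into the next encryption, identical plaintext blocks produce different-looking ciphertext blocks, and a fresh IV makes repeated messages encrypt differently.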

RECALL: Cipher Block Chaining (CBC)
• CBC generates its own random numbers: have the encryption of the current block depend on the result of the previous block:
  c(i) = K_S(m(i) ⊕ c(i-1))
  m(i) = K_S(c(i)) ⊕ c(i-1)
• How do we encrypt the first block? Initialization vector (IV): a random block c(0); the IV does not have to be secret.
• Change the IV for each message (or session): this guarantees that even if the same message is sent repeatedly, the ciphertext will be completely different each time.
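A quick check that decryption undoes encryption, substituting the definition of c(i) (the slides write K_S for both directions; applying it to a ciphertext block is understood to invert the encryption):

```latex
m(i) = K_S\bigl(c(i)\bigr) \oplus c(i-1)
     = \bigl(m(i) \oplus c(i-1)\bigr) \oplus c(i-1)
     = m(i), \qquad \text{with } c(0) = \mathrm{IV}.
```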

Cipher Block Chaining (CBC)
• CBC generates its own random numbers: have the encryption of the current block depend on the result of the previous block:
  c(i) = K_S(m(i) ⊕ c(i-1))
  m(i) = K_S(c(i)) ⊕ c(i-1)
• How do we encrypt the first block? Initialization vector (IV): a random block c(0); the IV does not have to be secret.
• Change the IV for each message (or session): this guarantees that even if the same message is sent repeatedly, the ciphertext will be completely different each time.

5.1. Replica Placement: The First Baby Steps

The placement of replicas is critical to HDFS reliability and performance. Optimizing replica placement distinguishes HDFS from most other distributed file systems; it is a feature that needs much tuning and experience. The purpose of a rack-aware replica placement policy is to improve data reliability, availability, and network bandwidth utilization. The current implementation of the replica placement policy is a first effort in this direction. The short-term goals of implementing this policy are to validate it on production systems, learn more about its behavior, and build a foundation for testing and researching more sophisticated policies.

Large HDFS instances run on clusters of computers that commonly spread across many racks. Communication between two nodes in different racks has to go through switches; in most cases, network bandwidth between machines in the same rack is greater than between machines in different racks. The NameNode determines the rack id each DataNode belongs to via the process outlined in Rack Awareness.

A simple but non-optimal policy is to place replicas on unique racks. This prevents losing data when an entire rack fails and allows use of bandwidth from multiple racks when reading data. It also distributes replicas evenly in the cluster, which makes it easy to balance load on component failure. However, it increases the cost of writes, because a write needs to transfer blocks to multiple racks.

For the common case, when the replication factor is three, HDFS's placement policy is to put one replica on one node in the local rack, another on a different node in the local rack, and the last on a different node in a different rack. This policy cuts the inter-rack write traffic, which generally improves write performance. Since the chance of rack failure is far less than that of node failure, this policy does not impact data reliability and availability guarantees. However, it does reduce the aggregate network bandwidth used when reading data, since a block is placed in only two unique racks rather than three. With this policy, the replicas of a file are not evenly distributed across the racks: one third of the replicas are on one node, two thirds are on one rack, and the other third are evenly distributed across the remaining racks. This improves write performance without compromising data reliability or read performance. The current, default replica placement policy described here is a work in progress.

5.2. Replica Selection

To minimize global bandwidth consumption and read latency, HDFS tries to satisfy a read request from the replica closest to the reader. If a replica exists on the same rack as the reader node, that replica is preferred to satisfy the read request. If an HDFS cluster spans multiple data centers, then a replica resident in the local data center is preferred over any remote replica.

5.3. Safemode

On startup, the NameNode enters a special state called Safemode; replication of data blocks does not occur while the NameNode is in this state. The NameNode receives Heartbeat and Blockreport messages from the DataNodes. A Blockreport contains the list of data blocks a DataNode is hosting. Each block has a specified minimum number of replicas, and a block is considered safely replicated when that minimum number of replicas has checked in with the NameNode. After a configurable percentage of safely replicated data blocks has checked in with the NameNode (plus an additional 30 seconds), the NameNode exits the Safemode state. It then determines the list of data blocks (if any) that still have fewer than the specified number of replicas and replicates those blocks to other DataNodes.
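Putting section 5.1's default policy (replication factor three) into code: the Python sketch below is an illustration only, not the actual HDFS implementation, and every name in it (place_replicas, the racks mapping, the node ids) is made up for the example.

```python
# Sketch of HDFS's default placement for replication factor 3, per 5.1 above:
# replica 1 on the writer's node, replica 2 on a different node in the same
# rack, replica 3 on a node in a different rack. Illustrative names only.
import random

def place_replicas(local_node, local_rack, racks):
    """racks: dict mapping rack id -> list of node ids.
    Assumes the local rack has >= 2 nodes and the cluster has >= 2 racks."""
    same_rack = [n for n in racks[local_rack] if n != local_node]
    remote_rack = random.choice([r for r in racks if r != local_rack])
    return [
        local_node,                         # 1: local node (cheapest write)
        random.choice(same_rack),           # 2: same rack, different node
        random.choice(racks[remote_rack]),  # 3: different rack (survives rack failure)
    ]

racks = {"r1": ["n1", "n2", "n3"], "r2": ["n4", "n5"], "r3": ["n6"]}
print(place_replicas("n1", "r1", racks))    # e.g. ['n1', 'n3', 'n6']
```

Because two of the three replicas share the writer's rack, only one block transfer crosses racks on the write path, and a reader on that rack can usually be served rack-locally, which is the preference described in section 5.2.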
Bigger Picture: Hadoop vs. Other Systems

                     Distributed databases                       Hadoop
Computing model      Transaction is the unit of work;            Job is the unit of work;
                     ACID properties, concurrency control        no concurrency control
Data model           Structured data with a known schema;        Any data, any format;
                     read/write mode                             read-only mode
Cost model           Expensive servers                           Cheap commodity machines
Fault tolerance      Failures are rare;                          Failures are common (~1000s of machines);
                     recovery mechanisms                         simple, efficient fault tolerance
Key characteristics  Efficiency, optimizations, fine-tuning      Scalability, flexibility, fault tolerance

Cloud Computing: a compute model in which any compute infrastructure can run on the cloud; hardware and software are provided as remote services. It is elastic, growing and shrinking based on the user's demand. Example: Amazon EC2.