Graph kernel: Kernel methods are popular and have broad applications in data mining. Informally, a kernel function can be viewed as a positive definite matrix that measures the similarity between each pair of input data items. In the current study, a graph kernel method, namely the shortest-path kernel developed by Borgwardt and Kriegel, is used to compute the similarities between graphs.

The first step of the shortest-path kernel is to transform the original graphs into shortest-path graphs. A shortest-path graph has the same nodes as its original graph, and between each pair of nodes there is an edge labeled with the shortest distance between the two nodes in the original graph. In the current study, this edge label is referred to as the weight of the edge. The transformation can be done with any algorithm that solves the all-pairs shortest-paths problem; here, the Floyd-Warshall algorithm was used.

Let G1 and G2 be two original graphs. They are transformed into shortest-path graphs S1(V1, E1) and S2(V2, E2), where V1 and V2 are the sets of nodes in S1 and S2, and E1 and E2 are the sets of edges in S1 and S2. A kernel function then calculates the similarity between G1 and G2 by comparing all pairs of edges between S1 and S2:

    K(G1, G2) = sum over e1 in E1, sum over e2 in E2 of k_edge(e1, e2),

where k_edge( ) is a kernel function for comparing two edges (including the node labels and the edge weight). Let e1 be the edge between nodes v1 and w1, and e2 be the edge between nodes v2 and w2. Then

    k_edge(e1, e2) = k_node(v1, v2) * k_node(w1, w2) * k_weight(e1, e2),

where k_node( ) is a kernel function for comparing the labels of two nodes, and k_weight( ) is a kernel function for comparing the weights of two edges. These two functions are defined as in Borgwardt et al. (2005):

    k_node(v1, v2) = exp( -||labels(v1) - labels(v2)||^2 / (2*sigma^2) ),

where labels(v) returns the vector of attributes associated with node v. Note that k_node( ) is a Gaussian kernel function; its width parameter sigma was set to 72 by trying different values between 32 and 128 with increments of 2.

    k_weight(e1, e2) = max(0, c - |weight(e1) - weight(e2)|),

where weight(e) returns the weight of edge e. k_weight( ) is a Brownian bridge kernel that assigns the highest value to edges that are identical in length. The constant c was set to 2, as in Borgwardt et al. (2005).

Classification and cross-validation: When the shortest-path graph kernel is used to compute similarities between graphs, the results are affected by the sizes of the graphs. Consider the case in which graph G is compared with graphs Gx and Gy separately using the graph kernel. If Gx has more nodes than Gy, then |Ex| > |Ey|, where Ex and Ey are the sets of edges in the shortest-path graphs of Gx and Gy. Therefore, the double summation in K(G, Gx) includes more terms than the summation in K(G, Gy), and each term (i.e., k_edge( )) inside the summation has a non-negative value. The consequence is that K(G, Gx) > K(G, Gy) does not necessarily indicate that Gx is more similar to G than Gy is; instead, it can be an artifact of Gx having more nodes than Gy. To overcome this problem, a voting strategy is developed for predicting whether a graph (or a patch) is an interface patch:

Algorithm Voting_Strategy(G)
Input: graph G
Output: prediction of whether G is an interface patch or a non-interface patch
    Let T be the set of proteins in the training set, and let N = |T|
    Let v be the number of votes given to "G is an interface patch"; v = 0
    While (T is not empty) {
        Take one protein P out of T
        Let Gint and Gnon-int be the interface and non-interface patches from P
        If K(G, Gint) > K(G, Gnon-int), then increase v by 1
    }
    If v > N/2, then G is an interface patch
    Else G is a non-interface patch

Using this strategy, when K(G, Gint) is compared with K(G, Gnon-int), Gint and Gnon-int are guaranteed to have an identical number of nodes, since they are the interface and non-interface patches extracted from the same protein (see section 2.4 for details). Each time K(G, Gint) > K(G, Gnon-int) is true, one vote is given to "G is an interface patch". In the end, G is predicted to be an interface patch if "G is an interface patch" gets more than half of the total votes, i.e., v > N/2. Leave-one-out cross-validation was performed at the protein level. In one round of the experiment, the interface patch and non-interface patch of a
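To make the computation concrete, here is a minimal Java sketch of the shortest-path kernel and the voting rule described above. It is an illustration, not the authors' implementation: the class name, the adjacency-matrix graph representation, the single-orientation matching of undirected edges, and the attribute-vector encoding are assumptions; only sigma = 72 and c = 2 come from the text.

```java
import java.util.Arrays;

/** Sketch of the shortest-path graph kernel described above (not the authors' code). */
public class ShortestPathKernel {

    static final double SIGMA = 72.0;  // Gaussian width chosen in the text
    static final double C = 2.0;       // Brownian bridge constant, as in Borgwardt et al. (2005)

    /** Floyd-Warshall: adjacency matrix (0 on the diagonal, Double.POSITIVE_INFINITY = no edge)
        is turned into an all-pairs shortest-distance matrix. */
    static double[][] floydWarshall(double[][] adj) {
        int n = adj.length;
        double[][] d = new double[n][];
        for (int i = 0; i < n; i++) d[i] = Arrays.copyOf(adj[i], n);
        for (int k = 0; k < n; k++)
            for (int i = 0; i < n; i++)
                for (int j = 0; j < n; j++)
                    if (d[i][k] + d[k][j] < d[i][j]) d[i][j] = d[i][k] + d[k][j];
        return d;
    }

    /** k_node: Gaussian kernel on node attribute vectors (assumed to have equal length). */
    static double kNode(double[] a, double[] b) {
        double sq = 0;
        for (int i = 0; i < a.length; i++) sq += (a[i] - b[i]) * (a[i] - b[i]);
        return Math.exp(-sq / (2 * SIGMA * SIGMA));
    }

    /** k_weight: Brownian bridge kernel on edge weights (shortest distances). */
    static double kWeight(double w1, double w2) {
        return Math.max(0.0, C - Math.abs(w1 - w2));
    }

    /** K(G1, G2): sum of k_edge over all pairs of edges of the two shortest-path graphs.
        For brevity, each undirected edge is matched in a single orientation. */
    static double kernel(double[][] adj1, double[][] labels1,
                         double[][] adj2, double[][] labels2) {
        double[][] d1 = floydWarshall(adj1), d2 = floydWarshall(adj2);
        double k = 0;
        for (int v1 = 0; v1 < d1.length; v1++)
            for (int w1 = v1 + 1; w1 < d1.length; w1++)
                for (int v2 = 0; v2 < d2.length; v2++)
                    for (int w2 = v2 + 1; w2 < d2.length; w2++) {
                        if (Double.isInfinite(d1[v1][w1]) || Double.isInfinite(d2[v2][w2]))
                            continue;   // skip node pairs that are not connected
                        k += kNode(labels1[v1], labels2[v2])
                           * kNode(labels1[w1], labels2[w2])
                           * kWeight(d1[v1][w1], d2[v2][w2]);
                    }
        return k;
    }

    /** Voting strategy: kInt[i] = K(G, interface patch of training protein i),
        kNon[i] = K(G, non-interface patch of the same protein. */
    static boolean isInterfacePatch(double[] kInt, double[] kNon) {
        int votes = 0;
        for (int i = 0; i < kInt.length; i++) if (kInt[i] > kNon[i]) votes++;
        return votes > kInt.length / 2.0;   // more than half of the total votes
    }
}
```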




Modeling Convention for Subsystems and Interfaces Represent a subsystem as three items in the model: 1. a «subsystem» package; 2. a «subsystem proxy» class; 3. the subsystem interface (a class with stereotype «interface»). The subsystem package provides a container for the elements in the subsystem. The interaction diagrams describe how the subsystem elements collaborate to implement the operations of the interface the subsystem realizes. Note: the «subsystem proxy» class actually realizes the interface and will orchestrate the implementation of the subsystem operations. (Diagram: «interface» ICourseCatalogSystem; «subsystem proxy» CourseCatalogSystem; «subsystem» CourseCatalogSystem.) Different (additional) interfaces would have their own proxy!
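The course-catalog example used on these slides maps naturally onto a Java interface plus a proxy class. The sketch below is illustrative only: the return type, method body, and delegation comment are assumptions, and the operation name getCourseOfferings() is taken from the later responsibility slide.

```java
// Sketch only: one possible Java rendering of the convention above.

// «interface» - the subsystem interface seen by clients.
public interface ICourseCatalogSystem {
    java.util.List<String> getCourseOfferings();   // operation from the course-registration example
}

// «subsystem proxy» - the only class that realizes the interface; it orchestrates
// the (hidden) elements inside the «subsystem» package.
class CourseCatalogSystem implements ICourseCatalogSystem {
    public java.util.List<String> getCourseOfferings() {
        // delegate to internal design elements of the subsystem package here
        return java.util.List.of();
    }
}
```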




Describing SubSystem Dependencies - Subsystems:  Subsystem Dependencies on a SubSystem. (Diagram: a «subsystem» Client Support depending on a «subsystem» Server Support / Flexible Server.) • When a subsystem uses some behavior of an element contained by another subsystem or package, a dependency on the external element is needed. • If the element on which the subsystem depends is within a different subsystem, the dependency should be on that subsystem's interface, not on the element within the subsystem, since we are denied entry to the subsystem. • We know the advantages of such a design… • It also gives the designer total freedom in designing the internal behavior of the subsystem.
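As a rough illustration of this rule, the client codes against the server subsystem's interface only; all type and method names below are invented for the sketch, not taken from the slides.

```java
// Illustration only: IServerSupport and its operation are made-up names.
public interface IServerSupport {
    String fetchRecord(int id);          // part of the server subsystem's public interface
}

class ClientSupport {
    private final IServerSupport server; // dependency on the interface only

    ClientSupport(IServerSupport server) {   // the concrete server subsystem is supplied from outside
        this.server = server;
    }

    String describe(int id) {
        // never reaches into classes hidden inside the server subsystem
        return "record: " + server.fetchRecord(id);
    }
}
```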




C:\UMBC\331\java>java envSnoop
-- listing properties --
java.specification.name=Java Platform API Specification
awt.toolkit=sun.awt.windows.WToolkit
java.version=1.2
java.awt.graphicsenv=sun.awt.Win32GraphicsEnvironment
user.timezone=America/New_York
java.specification.version=1.2
java.vm.vendor=Sun Microsystems Inc.
user.home=C:\WINDOWS
java.vm.specification.version=1.0
os.arch=x86
java.awt.fonts=
java.vendor.url=http://java.sun.com/
user.region=US
file.encoding.pkg=sun.io
java.home=C:\JDK1.2\JRE
java.class.path=C:\Program Files\PhotoDeluxe 2.0\Adob...
line.separator=
java.ext.dirs=C:\JDK1.2\JRE\lib\ext
java.io.tmpdir=C:\WINDOWS\TEMP\
os.name=Windows 95
java.vendor=Sun Microsystems Inc.
java.awt.printerjob=sun.awt.windows.WPrinterJob
java.library.path=C:\JDK1.2\BIN;.;C:\WINDOWS\SYSTEM;C:\...
java.vm.specification.vendor=Sun Microsystems Inc.
sun.io.unicode.encoding=UnicodeLittle
file.encoding=Cp1252
java.specification.vendor=Sun Microsystems Inc.
user.language=en
user.name=nicholas
java.vendor.url.bug=http://java.sun.com/cgi-bin/bugreport...
java.vm.name=Classic VM
java.class.version=46.0
java.vm.specification.name=Java Virtual Machine Specification
sun.boot.library.path=C:\JDK1.2\JRE\bin
os.version=4.10
java.vm.version=1.2
java.vm.info=build JDK-1.2-V, native threads, symcjit
java.compiler=symcjit
path.separator=;
file.separator=\
user.dir=C:\UMBC\331\java
sun.boot.class.path=C:\JDK1.2\JRE\lib\rt.jar;C:\JDK1.2\JR...
user.name=nicholas
user.home=C:\WINDOWS
C:\UMBC\331\java>
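The source of the envSnoop program is not shown on the slide; a minimal stand-in that produces this kind of listing with java.util.Properties is sketched below (the class name is assumed).

```java
import java.util.Properties;

// Minimal equivalent of the envSnoop program whose output is shown above.
public class EnvSnoop {
    public static void main(String[] args) {
        Properties props = System.getProperties(); // all JVM/system properties
        props.list(System.out);                    // prints the "-- listing properties --" dump
    }
}
```

Note that Properties.list( ) truncates long values, which is why several lines in the output above end in "...".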




Modeling Convention: Subsystem Interaction Diagrams - General. (Diagram: a Subsystem Client sends performResponsibility( ) to the Subsystem Proxy, the subsystem responsibility; the proxy then invokes Op1( ), Op2( ), Op3( ), and Op4( ) on Design Element 1 and Design Element 2 as internal subsystem interactions.) The message should be drawn from the subsystem client to the «subsystem proxy». Note: the interface does not appear on the internal subsystem interaction diagram. The remainder of the diagram should model how the «subsystem proxy» class delegates responsibility for performing the invoked operation to the other subsystem elements.  Recommend you name the interaction diagram <subsystem name>::<interface operation name>. This convention simplifies future tracing of interface behaviors to the classes implementing the interface operations.
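A minimal sketch of the delegation being described, using the slide's element and operation names; the fields, construction, and exact ordering are assumptions.

```java
// Sketch: how the «subsystem proxy» might delegate one interface operation internally.
class SubsystemProxy {
    private final DesignElement1 e1 = new DesignElement1();
    private final DesignElement2 e2 = new DesignElement2();

    // The client's message arrives here; the interface itself never appears
    // on the internal interaction diagram.
    public void performResponsibility() {
        e1.op1();   // internal subsystem interactions,
        e1.op2();   // in the order shown on the diagram
        e2.op3();
        e2.op4();
    }
}

class DesignElement1 { void op1() {} void op2() {} }
class DesignElement2 { void op3() {} void op4() {} }
```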




Operating-System Operations  Interrupt driven by hardware  Software error or request creates exception or trap  Division by zero, request for operating system service  Other process problems include infinite loop, processes modifying each other or the operating system  Dual-mode operation allows OS to protect itself and other system components  User mode and kernel mode  Mode bit provided by hardware  Provides ability to distinguish when system is running user code or kernel code  Some instructions designated as privileged, only executable in kernel mode  System call changes mode to kernel, return from call resets it to user Operating System Concepts with Java – 8th Edition 1.30 Silberschatz, Galvin and Gagne ©2009
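A user program never flips the mode bit itself; it simply asks the OS for a service, and the trap/system-call machinery switches to kernel mode and back. The Java fragment below only illustrates where that transition happens when a file is read; the mode switch is performed by the hardware and the OS, not by this code, and the file name is an assumption.

```java
import java.io.FileInputStream;
import java.io.IOException;

public class ReadDemo {
    public static void main(String[] args) throws IOException {
        // Runs in user mode; assumes data.txt exists in the working directory.
        try (FileInputStream in = new FileInputStream("data.txt")) {
            int b = in.read();   // the read eventually issues a read() system call:
                                 // trap to kernel mode, perform the privileged I/O, return to user mode
            System.out.println("first byte: " + b);
        }
    }
}
```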




Operating-System Operations Interrupt driven by hardware  Software error or request creates exception or trap  Division by zero, request for operating system service  Other process problems include infinite loop, processes modifying each other or the operating system  Dual-mode operation allows OS to protect itself and other system components  User mode and kernel mode  Mode bit provided by hardware  Provides ability to distinguish when system is running user code or kernel code  Some instructions designated as privileged, only executable in kernel mode  System call changes mode to kernel, return from call resets it to user  Operating System Concepts – 8th Edition 1.28 Silberschatz, Galvin and Gagne ©2009




Chapter 13: I/O Systems  I/O Hardware  Application I/O Interface  Kernel I/O Subsystem  Transforming I/O Requests to Hardware Operations Operating System Concepts with Java – 8th Edition 12.24 Silberschatz, Galvin and Gagne ©2009




Subsystem Decomposition into Layers. (Diagram: subsystems A, B, C, D, E, F, and G arranged into Layer 1, Layer 2, and Layer 3.) Ideally use one package for each subsystem.  Subsystem Decomposition Heuristics:  No more than 7+/-2 subsystems. Why? More subsystems increase cohesion but also complexity (more services).  No more than 4+/-2 layers; use 3 layers (good). Why? Bernd Bruegge & Allen H. Dutoit, Object-Oriented Software Engineering: Using UML, Patterns, and Java
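A tiny sketch of the three-layer idea (all class names are invented): each subsystem sits in its own layer and calls only the layer directly below it.

```java
// Illustration only: three layers, calls flow downward only.

class PersistenceStore {                 // Layer 3: persistence subsystem
    void save(String item) { System.out.println("saved " + item); }
}

class OrderService {                     // Layer 2: application/service subsystem
    private final PersistenceStore store = new PersistenceStore();
    void placeOrder(String item) { store.save(item); }        // depends only on the layer below
}

class OrderScreen {                      // Layer 1: user-interface subsystem
    private final OrderService service = new OrderService();
    void submit() { service.placeOrder("textbook"); }          // never reaches into Layer 3 directly
}
```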




Chapter 13: I/O Systems  I/O Hardware  Application I/O Interface  Kernel I/O Subsystem  Transforming I/O Requests to Hardware Operations  Streams  Performance Operating System Concepts – 8th Edition 13.2 Silberschatz, Galvin and Gagne ©2009




Chapter 2: Operating-System Structures  Operating System Services  User Operating System Interface  System Calls  Types of System Calls  System Programs  Operating System Design and Implementation  Operating System Structure  Virtual Machines  Operating System Debugging  Operating System Generation  System Boot Operating System Concepts – 8th Edition 2.2 Silberschatz, Galvin and Gagne ©2009








Section 2. System Decomposition  Subsystem (UML: Package)  Collection of classes, associations, operations, events and constraints that are interrelated  Seed for subsystems: UML Objects and Classes.  (Subsystem) Service: From what spec?  Group of operations provided by the subsystem  Seed for services: Subsystem use cases  Service is specified by the Subsystem interface:  Specifies interaction and information flow from/to subsystem boundaries, but not inside the subsystem.  Should be well-defined and small.  Often called API: Application programmer's interface, but this term should be used during implementation, not during System Design Bernd Bruegge & Allen H. Dutoit Object-Oriented Software Engineering: Using UML, Patterns, and Java 15




HDFS (the Hadoop Distributed File System) is a distributed file system for commodity hardware. Its differences from other distributed file systems are few but significant. HDFS is highly fault-tolerant and is designed to be deployed on low-cost hardware. It provides high-throughput access to application data and is suitable for applications that have large data sets. HDFS relaxes a few POSIX requirements to enable streaming access to file system data. HDFS was originally built as infrastructure for the Apache Nutch web search engine project and is part of Apache Hadoop Core (http://hadoop.apache.org/core/).

2.1. Hardware Failure: Hardware failure is the norm rather than the exception. An HDFS instance may consist of hundreds or thousands of server machines, each storing part of the file system's data. The fact that there are many components and that each component has a non-trivial probability of failure means that some component of HDFS is always non-functional. Detection of faults and quick, automatic recovery from them is a core architectural goal of HDFS.

2.2. Streaming Data Access: Applications that run on HDFS need streaming access to their data sets. They are not general-purpose applications that typically run on general-purpose file systems. HDFS is designed more for batch processing than for interactive use by users. The emphasis is on high throughput of data access rather than low latency of data access. POSIX imposes many hard requirements that are not needed for applications targeted for HDFS. POSIX semantics in a few key areas have been traded to increase data throughput rates.

2.3. Large Data Sets: Applications on HDFS have large data sets, typically gigabytes to terabytes in size. Thus, HDFS is tuned to support large files. It provides high aggregate data bandwidth and scales to hundreds of nodes in a single cluster. It supports roughly 10 million files in a single instance.

2.4. Simple Coherency Model: HDFS applications need a write-once-read-many access model for files. A file, once created, written, and closed, need not be changed. This assumption simplifies data coherency issues and enables high-throughput data access. A Map/Reduce application or a web crawler application fits perfectly with this model. There is a plan to support appending writes to files in the future. [Write once, read many, at the file level.]

2.5. "Moving Computation is Cheaper than Moving Data": A computation requested by an application is much more efficient if it is executed near the data it operates on. This is especially true when the size of the data set is huge, since it minimizes network congestion and increases the overall throughput of the system. The assumption is that it is often better to migrate the computation closer to where the data is located than to move the data to where the application is running. HDFS provides interfaces for applications to move themselves closer to where the data is located.

2.6. Portability Across Heterogeneous Hardware and Software Platforms: HDFS has been designed to be easily portable from one platform to another. This facilitates widespread adoption of HDFS as a platform of choice for a large set of applications.

3. NameNode and DataNodes: HDFS has a master/slave architecture. An HDFS cluster consists of a single NameNode, a master server that manages the file system namespace and regulates access to files by clients. In addition, there are a number of DataNodes, usually one per node in the cluster, which manage storage attached to the nodes that they run on. HDFS exposes a file system namespace and allows user data to be stored in files. Internally, a file is split into one or more blocks, and these blocks are stored in a set of DataNodes.

The NameNode executes file system namespace operations like opening, closing, and renaming files and directories. It also determines the mapping of blocks to DataNodes. The DataNodes are responsible for serving read and write requests from the file system's clients. The DataNodes also perform block creation, deletion, and replication upon instruction from the NameNode. The NameNode and DataNode are pieces of software designed to run on commodity machines, which typically run a GNU/Linux operating system (OS). HDFS is built using the Java language; any machine that supports Java can run the NameNode or the DataNode software. Use of the highly portable Java language means that HDFS can be deployed on a wide range of machines. A typical deployment has a dedicated machine that runs only the NameNode software. Each of the other machines in the cluster runs one instance of the DataNode software. The architecture does not preclude running multiple DataNodes on the same machine, but in a real deployment that is rarely the case. The existence of a single NameNode in a cluster greatly simplifies the architecture of the system. The NameNode is the arbitrator and repository for all HDFS metadata. The system is designed in such a way that user data never flows through the NameNode.

4. The File System Namespace: HDFS supports a traditional hierarchical file organization. A user or an application can create directories and store files inside these directories. The file system namespace hierarchy is similar to most other existing file systems; one can create and remove files, move a file from one directory to another, or rename a file. HDFS does not yet implement user quotas or access permissions. HDFS does not support hard links or soft links. However, the HDFS architecture does not preclude implementing these features. The NameNode maintains the file system namespace. Any change to the file system namespace or its properties is recorded by the NameNode. An application can specify the number of replicas of a file that should be maintained by HDFS. The number of copies of a file is called the replication factor of that file. This information is stored by the NameNode.

5. Data Replication: HDFS is designed to reliably store very large files across machines in a large cluster. It stores each file as a sequence of blocks; all blocks in a file except the last block are the same size. The blocks of a file are replicated for fault tolerance. The block size and replication factor are configurable per file. An application can specify the number of replicas of a file. The replication factor can be specified at file creation time and can be changed later. Files in HDFS are write-once and have strictly one writer at any time. The NameNode makes all decisions regarding replication of blocks. It periodically receives a Heartbeat and a Blockreport from each of the DataNodes in the cluster. Receipt of a Heartbeat implies that the DataNode is functioning properly. A Blockreport contains a list of all blocks on a DataNode.
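From an application's point of view, the NameNode/DataNode machinery sits behind Hadoop's FileSystem API. The sketch below is a hedged illustration of the write-once model and the per-file replication factor; the path, the file contents, and the presence of a configured, reachable cluster are assumptions.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Sketch only: assumes an HDFS cluster is configured (e.g., via core-site.xml / hdfs-site.xml).
public class HdfsWriteOnce {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();            // picks up the cluster configuration
        FileSystem fs = FileSystem.get(conf);                // metadata operations go to the NameNode

        Path file = new Path("/user/demo/crawl-0001.txt");   // hypothetical path
        try (FSDataOutputStream out = fs.create(file)) {     // data blocks stream to DataNodes
            out.writeUTF("write once, read many");
        }                                                    // closed: the file is not modified again
        fs.setReplication(file, (short) 3);                  // per-file replication factor
        System.out.println("block size: " + fs.getFileStatus(file).getBlockSize());
    }
}
```

Creating, writing, and closing the file without further updates mirrors the simple coherency model described in section 2.4.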




Chapter 2: Operating-System Structures  Operating System Services  User Operating System Interface  System Calls  Types of System Calls  System Programs  Operating System Design and Implementation  Operating System Structure  Virtual Machines  Operating System Debugging  Operating System Generation  System Boot Operating System Concepts Essentials – 8th Edition 2.2 Silberschatz, Galvin and Gagne ©2011
View full slide show




Comparison with other methods: Recently, Tjong and Zhou (2007) developed a neural network method for predicting DNA-binding sites. In their method, for each surface residue, the PSSM and solvent accessibilities of the residue and its 14 neighbors were used as input to a neural network in the form of vectors. In their publication, Tjong and Zhou showed that their method achieved better performance than other previously published methods. In the current study, the 13 test proteins were obtained from the study of Tjong and Zhou. Thus, we can compare the method proposed in the current study with Tjong and Zhou's neural network method using these 13 proteins.

Figure 1. Tradeoff between coverage and accuracy.

In their publication, Tjong and Zhou also used coverage and accuracy to evaluate predictions. However, they defined accuracy using a loosened criterion of "true positive", such that if a predicted interface residue is within the four nearest neighbors of an actual interface residue, it is counted as a true positive. Here, in the comparison of the two methods, the strict definition of true positive is used, i.e., a predicted interface residue is counted as a true positive only when it is a true interface residue. The original data were obtained from Table 1 of Tjong and Zhou (2007), and the accuracy for the neural network method was recalculated using this strict definition (Table 3). The coverage of the neural network method was taken directly from Tjong and Zhou (2007).

For each protein, Tjong and Zhou's method reports one coverage and one accuracy. In contrast, the method proposed in this study allows users to trade off between coverage and accuracy based on their actual needs. For the purpose of comparison, for each test protein, top-ranking patches are added to the set of predicted interface residues one by one in decreasing order of rank until the coverage is the same as or higher than the coverage that the neural network method achieved on that protein. Then the coverage and accuracy of the two methods are compared. On a test protein, method A is better than method B if accuracy(A) > accuracy(B) and coverage(A) ≥ coverage(B).

Table 3 shows that the graph kernel method proposed in this study achieves better results than the neural network method on 7 proteins (in bold font in Table 3). On 4 proteins (shown in gray shading in Table 3), the neural network method is better than the graph kernel method. On the remaining 2 proteins (in italic font in Table 3), no conclusion can be drawn because the two conditions, accuracy(A) > accuracy(B) and coverage(A) ≥ coverage(B), never become true at the same time; i.e., when coverage(graph kernel) > coverage(neural network), we have accuracy(graph kernel) < accuracy(neural network), and vice versa. Note that the coverage of the graph kernel method increases in a discontinuous fashion as more patches are used to predict DNA-binding sites. On these two proteins, we were not able to reach a point where the two methods have identical coverage. Given these situations, we consider that the two methods tie on these 2 proteins. Thus, these comparisons show that the graph kernel method achieves better results than the neural network on 7 of the 13 proteins (shown in bold font in Table 3). Additionally, on another 4 proteins (shown in italic font in Table 3), the graph kernel method ties with the neural network method. When averaged over the 13 proteins, the coverage and accuracy for the graph kernel method are 59% and 64%.

It is worth pointing out that, in the current study, the predictions are made using protein structures that are not bound with DNA. In contrast, the data we obtained from Tjong and Zhou's study were obtained using protein structures bound with DNA. In their study, Tjong and Zhou showed that when unbound structures were used, the average coverage decreased by 6.3% and the average accuracy by 4.7% for the 14 proteins (but the data for each protein were not shown).
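Under the strict definition used here, coverage is the fraction of actual interface residues that are predicted (recall) and accuracy is the fraction of predicted residues that are actual interface residues (precision). The small Java sketch below just spells out these two measures; the residue identifiers and example sets are invented.

```java
import java.util.Set;

// Sketch of the strict coverage/accuracy definitions used in the comparison above.
public class InterfaceMetrics {
    static double coverage(Set<Integer> predicted, Set<Integer> actual) {
        return (double) truePositives(predicted, actual) / actual.size();     // TP / actual interface residues
    }
    static double accuracy(Set<Integer> predicted, Set<Integer> actual) {
        return (double) truePositives(predicted, actual) / predicted.size();  // TP / predicted residues
    }
    static long truePositives(Set<Integer> predicted, Set<Integer> actual) {
        return predicted.stream().filter(actual::contains).count();           // strict: exact residue match only
    }

    public static void main(String[] args) {
        Set<Integer> actual = Set.of(1, 2, 3, 4, 5);     // invented example
        Set<Integer> predicted = Set.of(2, 3, 6);
        System.out.printf("coverage=%.2f accuracy=%.2f%n",
                coverage(predicted, actual), accuracy(predicted, actual));    // prints 0.40 and 0.67
    }
}
```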




Example of Subsystems Realization. (Diagram: a «subsystem» named Subsystem Name realizing a subsystem Interface.) A subsystem might be AccountsReceivable, AccountsPayable, or Billing, that is, a major hunk of functionality. BUT, a client of the subsystem does NOT have access to the individual classes as in a Package. Rather, a client must go through the public Interface to the subsystem, which contains the signatures of the services provided within the subsystem. The contents of the subsystem are NOT directly accessed; they are protected; only the services shown in the interface are made available to clients. OOAD Using the UML - Introduction to Object Orientation, v 4.2. Copyright © 1998-1999 Rational Software, all rights reserved




Kernel Modules  Sections of kernel code that can be compiled, loaded, and unloaded independent of the rest of the kernel  A kernel module may typically implement a device driver, a file system, or a networking protocol  The module interface allows third parties to write and distribute, on their own terms, device drivers or file systems that could not be distributed under the GPL  Kernel modules allow a Linux system to be set up with a standard, minimal kernel, without any extra device drivers built in Operating System Concepts – 8th Edition 21.15 Silberschatz, Galvin and Gagne ©2009




Design Elements in Each Component. You may combine this drawing with the previous drawing; otherwise, make this separate. For each component, you must also - as much as possible - include the classes and their properties/methods needed to 'realize' the interface. Recognize that signatures in the subsystem interface must be accommodated by the classes or other components (along with other dependencies 'they' might have) in the subsystem. There will likely be additional methods within each class that are not part of the signature of the subsystem interface, but are needed within the subsystem itself to realize a specific signature of the subsystem interface. (Diagram annotations: add the properties and methods so that the class diagrams are good; also include the dependencies objects will experience in realizing the interface, hence the need for Maintain Database and anything else that will assist in realizing the interface; an interface with operations such as Addrec(xxxx, xx): bool, UpdateRec(xx, xx): int, DeleteRec(xxxxxx), etc.; multiplicities such as 1..2 and *; a dependency between an object in the subsystem and an object in another design element - a package, here.) The interface is realized by the components along with dependencies 'they' might have to realize one or more signatures in the subsystem interface.
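Read as code, the point is that every signature in the subsystem interface must be accommodated by some class inside the subsystem, often with extra private methods and a dependency such as the Maintain Database element. The sketch below uses the slide's placeholder signatures; all other names, types, and bodies are assumptions.

```java
// Sketch: the subsystem interface's signatures (from the slide) and one way a class
// inside the subsystem could accommodate them. Record/database details are invented.
interface IRecordMaintenance {
    boolean addRec(String key, String data);    // Addrec(xxxx, xx): bool
    int updateRec(String key, String data);     // UpdateRec(xx, xx): int
    void deleteRec(String key);                 // DeleteRec(xxxxxx)
}

class RecordMaintenance implements IRecordMaintenance {
    private final MaintainDatabase db = new MaintainDatabase();  // dependency shown in the diagram

    public boolean addRec(String key, String data) { return db.insert(key, validate(data)); }
    public int updateRec(String key, String data)  { return db.update(key, validate(data)); }
    public void deleteRec(String key)              { db.remove(key); }

    // extra method needed inside the subsystem but not part of the interface signature
    private String validate(String data) { return data == null ? "" : data.trim(); }
}

class MaintainDatabase {   // stand-in for the element in another design element/package
    boolean insert(String k, String v) { return true; }
    int update(String k, String v)     { return 1; }
    void remove(String k)              {}
}
```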




“Software Defined Networking” approach to open it. (Diagram: LB service, IP routing service, and FW service run on top of a single Network Operating System, which controls many boxes of Specialized Packet Forwarding Hardware; previously each box ran its own Operating System with its own services.)




“Software Defined Networking” approach to open it. (Diagram: apps run on top of a single Network Operating System, which controls many boxes of Specialized Packet Forwarding Hardware; previously each box ran its own Operating System with its own apps.)




Subsystem Responsibilities    Subsystem responsibilities are defined by the interface it realizes  When a subsystem realizes an interface, it makes a commitment to support every operation defined by the interface.  Interface operations may be realized by  Internal class operations (which may require collaboration with other classes or subsystems)  An interface realized by a contained subsystem. (Diagram: «interface» ICourseCatalogSystem with operation getCourseOfferings(), the subsystem responsibility, realized by «subsystem proxy» CourseCatalogSystem.)




Operating System Services  Operating systems provide an environment for execution of programs and services to programs and users  One set of operating-system services provides functions that are helpful to the user:  User interface - Almost all operating systems have a user interface (UI).  Varies between Command-Line (CLI), Graphics User Interface (GUI), Batch  Program execution - The system must be able to load a program into memory and to run that program, end execution, either normally or abnormally (indicating error)  I/O operations - A running program may require I/O, which may involve a file or an I/O device  File-system manipulation - The file system is of particular interest. Programs need to read and write files and directories, create and delete them, search them, list file information, permission management. Operating System Concepts – 8th Edition 2.4 Silberschatz, Galvin and Gagne ©2009
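Each of these services is reached through ordinary library calls. The short Java illustration below exercises program execution, I/O, and file-system manipulation; the file name is arbitrary, and it assumes a java launcher is available on the PATH.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

// Illustration: requesting a few OS services through the Java API;
// each call is ultimately served by the operating system.
public class OsServicesDemo {
    public static void main(String[] args) throws Exception {
        Path p = Path.of("notes.txt");                        // arbitrary file name
        Files.write(p, List.of("hello", "world"));            // create and write a file
        System.out.println(Files.readAllLines(p));            // read it back (I/O operation)
        Files.delete(p);                                      // delete it (file-system manipulation)

        // program execution service: ask the OS to start another program
        new ProcessBuilder("java", "-version").inheritIO().start().waitFor();
    }
}
```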




Chapter 13: I/O Systems  Overview  I/O Hardware  Application I/O Interface  Kernel I/O Subsystem  Transforming I/O Requests to Hardware Operations  STREAMS  Performance Operating System Concepts – 9th Edition 13.2 Silberschatz, Galvin and Gagne ©2013