Marti Hearst

Marti Alice Hearst is a professor in the School of Information at the University of California, Berkeley. She did early work in corpus-based computational linguistics, including some of the first work in automating sentiment analysis, and word sense disambiguation. She invented an algorithm that became known as "Hearst patterns" which applies lexico-syntactic patterns to recognize hyponymy (ISA) relations with high accuracy in large text collections, including an early application of it to WordNet; this algorithm is widely used in commercial text mining applications including ontology learning. Hearst also developed early work in automatic segmentation of text into topical discourse boundaries, inventing a now well-known approach called TextTiling. Hearst's research is on user interfaces for search engine technology and big data analytics. She did early work in user interfaces and information visualization for search user interfaces, inventing the TileBars query term visualization. Her Flamenco research project investigated and developed the now widely used faceted navigation approach for searching and browsing web sites and information collections. She wrote the first academic book on the topic of Search User Interfaces (Cambridge University Press, 2009). Hearst is an Edge Foundation contributing author and a member of the Usage panel of the American Heritage Dictionary of the English Language. Hearst received her B.A., M.S., and Ph.D. in computer science, all from Berkeley. In 2013 she became a fellow of the Association for Computing Machinery. She became a member of the CHI Academy in 2017, and has previously served as president of the Association for Computational Linguistics and on the advisory council of NSF's CISE Directorate. 
Additionally, she has been a member of the Web Board for CACM, the Usage Panel for the American Heritage Dictionary, the Edge.org panel of experts, the research staff at Xerox PARC, and the boards of ACM Transactions on the Web, Computational Linguistics, ACM Transactions on Information Systems, and IEEE Intelligent Systems. Hearst has received an NSF CAREER award, an IBM Faculty Award, and an Okawa Foundation Fellowship. Her work on user interfaces has had a profound impact on the industry, earning Hearst two Google Research Awards and four Excellence in Teaching Awards. She has also led projects worth over $3.5M in research grants. Hearst’s publications date back to 1990, when ‘A Hybrid Approach to Restricted Text Interpretation’ was published in Stanford University’s AAAI Spring Symposium on Text Based Intelligent Systems in March of that year.

Separable filter

A separable filter in image processing can be written as the product of two simpler filters. Typically, a 2-dimensional convolution operation is separated into two 1-dimensional filters. This reduces the computational cost on an $N\times M$ image with an $m\times n$ filter from $\mathcal{O}(M\cdot N\cdot m\cdot n)$ down to $\mathcal{O}(M\cdot N\cdot (m+n))$.

== Examples ==

1. A two-dimensional smoothing filter:

$$\frac{1}{3}\begin{bmatrix}1\\1\\1\end{bmatrix} * \frac{1}{3}\begin{bmatrix}1&1&1\end{bmatrix} = \frac{1}{9}\begin{bmatrix}1&1&1\\1&1&1\\1&1&1\end{bmatrix}$$

2. Another two-dimensional smoothing filter with stronger weight in the middle:

$$\frac{1}{4}\begin{bmatrix}1\\2\\1\end{bmatrix} * \frac{1}{4}\begin{bmatrix}1&2&1\end{bmatrix} = \frac{1}{16}\begin{bmatrix}1&2&1\\2&4&2\\1&2&1\end{bmatrix}$$

3. The Sobel operator, commonly used for edge detection:

$$\begin{bmatrix}1\\2\\1\end{bmatrix} * \begin{bmatrix}1&0&-1\end{bmatrix} = \begin{bmatrix}1&0&-1\\2&0&-2\\1&0&-1\end{bmatrix}$$

The same separation also works for the Prewitt operator.

In these examples, each 1-dimensional vector costs 3 multiply–accumulate operations per output pixel, which gives six in total (horizontal and vertical passes), compared with nine operations for the full 3×3 matrix. Another notable example of a separable filter is the Gaussian blur, for which the savings from separation grow as the convolution window gets larger.
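The equivalence between one 2-dimensional pass and two 1-dimensional passes can be checked with a minimal sketch in plain Python. The function names and the small test image below are illustrative assumptions, not part of any library. The kernel is applied in sliding-window (correlation) form and only at positions where it fits entirely inside the image; a true convolution would flip the kernel first, which does not affect separability.

```python
def correlate_full(image, kernel):
    """Slide a full 2-D kernel over the image (fully overlapping positions only)."""
    kh, kw = len(kernel), len(kernel[0])
    rows, cols = len(image), len(image[0])
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(cols - kw + 1)
        ]
        for r in range(rows - kh + 1)
    ]


def correlate_separable(image, column, row):
    """Apply the same filter as two 1-D passes: horizontal, then vertical."""
    n, m = len(row), len(column)
    rows, cols = len(image), len(image[0])
    # Horizontal pass: n multiply-accumulates per intermediate value.
    horizontal = [
        [sum(image[r][c + j] * row[j] for j in range(n))
         for c in range(cols - n + 1)]
        for r in range(rows)
    ]
    # Vertical pass: m multiply-accumulates per output value.
    return [
        [sum(horizontal[r + i][c] * column[i] for i in range(m))
         for c in range(len(horizontal[0]))]
        for r in range(rows - m + 1)
    ]


def outer(column, row):
    """Build the full 2-D kernel as the outer product of the two vectors."""
    return [[a * b for b in row] for a in column]


# Hypothetical 4 x 5 test image with integer intensities.
image = [
    [3, 1, 4, 1, 5],
    [9, 2, 6, 5, 3],
    [5, 8, 9, 7, 9],
    [3, 2, 3, 8, 4],
]
sobel_column, sobel_row = [1, 2, 1], [1, 0, -1]

full_result = correlate_full(image, outer(sobel_column, sobel_row))
separable_result = correlate_separable(image, sobel_column, sobel_row)
print(full_result == separable_result)  # prints True
```

With integer inputs the two results match exactly; with floating-point data they would typically agree only up to rounding error.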

GCube system

gCube is an open source software system designed and developed to support the building and operation of a data infrastructure that provides its users with a rich array of services for co-creating Virtual Research Environments and for promoting open science workflows and practices. It is at the heart of the D4Science Data Infrastructure. == Overview == It is primarily organised as a number of web services that offer functionality supporting the phases of knowledge production and sharing. In addition, it consists of a set of software libraries supporting service development, service-to-service integration, and service capabilities extension, and a set of portlets that realise the user interface components through which one or more services are used. It is designed to enable a system of systems: gCube services rely on standards and mediators to interact with other services, and are themselves exposed through standards and APIs so that clients can use them. For instance, the DataMiner service implements the Web Processing Service protocol so that clients can execute processes. The set of components dealing with Identity and Access Management relies on Keycloak and federates other IDMs, making overall authentication and authorization management compliant with open standards such as the OAuth2, User-Managed Access (UMA), and OpenID Connect (OIDC) protocols. The Catalogue relies on DCAT, OAI-PMH, and Catalogue Service for the Web to collect content from other catalogues and data sources, and offers its content through DCAT, OAI-PMH, and a proprietary REST API (gCat REST API). Its Continuous Integration/Continuous Delivery pipeline, implemented with Jenkins, represents an approach to software delivery conceived to be scalable and easy to maintain and upgrade at minimal cost.
== History == gCube has been developed in the context of the D4Science initiative with the support of several EU projects.

Color space

A color space is a specific organization of colors. In combination with color profiling supported by various physical devices, it supports reproducible representations of color – whether such representation entails an analog or a digital representation. A color space may be arbitrary, i.e. with physically realized colors assigned to a set of physical color swatches with corresponding assigned color names (including discrete numbers in – for example – the Pantone collection), or structured with mathematical rigor (as with the NCS System, Adobe RGB and sRGB). A "color space" is a useful conceptual tool for understanding the color capabilities of a particular device or digital file. When trying to reproduce color on another device, color spaces can show whether shadow/highlight detail and color saturation can be retained, and by how much either will be compromised. A "color model" is an abstract mathematical model describing the way colors can be represented as tuples of numbers (e.g. triples in RGB or quadruples in CMYK); however, a color model with no associated mapping function to an absolute color space is a more or less arbitrary color system with no connection to any globally understood system of color interpretation. Adding a specific mapping function between a color model and a reference color space establishes within the reference color space a definite "footprint", known as a gamut, and for a given color model, this defines a color space. For example, Adobe RGB and sRGB are two different absolute color spaces, both based on the RGB color model. When defining a color space, the usual reference standard is the CIELAB or CIEXYZ color spaces, which were specifically designed to encompass all colors the average human can see. Since "color space" identifies a particular combination of the color model and the mapping function, the word is often used informally to identify a color model. 
However, even though identifying a color space automatically identifies the associated color model, this usage is incorrect in a strict sense. For example, although several specific color spaces are based on the RGB color model, there is no such thing as the singular RGB color space. == History == In 1802, Thomas Young postulated the existence of three types of photoreceptors (now known as cone cells) in the eye, each of which was sensitive to a particular range of visible light. Hermann von Helmholtz developed the Young–Helmholtz theory further in 1850: that the three types of cone photoreceptors could be classified as short-preferring (blue), middle-preferring (green), and long-preferring (red), according to their response to the wavelengths of light striking the retina. The relative strengths of the signals detected by the three types of cones are interpreted by the brain as a visible color. But it is not clear that they thought of colors as being points in color space. The color-space concept was likely due to Hermann Grassmann, who developed it in two stages. First, he developed the idea of vector space, which allowed the algebraic representation of geometric concepts in n-dimensional space. Fearnley-Sander (1979) describes Grassmann's foundation of linear algebra as follows: The definition of a linear space (vector space)... became widely known around 1920, when Hermann Weyl and others published formal definitions. In fact, such a definition had been given thirty years previously by Peano, who was thoroughly acquainted with Grassmann's mathematical work. Grassmann did not put down a formal definition—the language was not available—but there is no doubt that he had the concept. With this conceptual background, in 1853, Grassmann published a theory of how colors mix; it and its three color laws are still taught, as Grassmann's law. As noted first by Grassmann... the light set has the structure of a cone in the infinite-dimensional linear space. 
As a result, a quotient set (with respect to metamerism) of the light cone inherits the conical structure, which allows color to be represented as a convex cone in the 3-D linear space, which is referred to as the color cone. == Examples == Colors can be created in printing with color spaces based on the CMYK color model, using the subtractive primary colors of pigment (cyan, magenta, yellow, and key [black]). To create a three-dimensional representation of a given color space, we can assign the amount of magenta color to the representation's X axis, the amount of cyan to its Y axis, and the amount of yellow to its Z axis. The resulting 3-D space provides a unique position for every possible color that can be created by combining those three pigments. Colors can be created on computer monitors with color spaces based on the RGB color model, using the additive primary colors (red, green, and blue). A three-dimensional representation would assign each of the three colors to the X, Y, and Z axes. Colors generated on a given monitor will be limited by the reproduction medium, such as the phosphor (in a CRT monitor) or filters and backlight (LCD monitor). Another way of creating colors on a monitor is with an HSL or HSV color model, based on hue, saturation, and brightness (value/lightness). With such a model, the variables are assigned to cylindrical coordinates. Many color spaces can be represented as three-dimensional values in this manner, but some have more or fewer dimensions, and some, such as Pantone, cannot be represented in this way at all. == Conversion == Color space conversion is the translation of the representation of a color from one basis to another. This typically occurs in the context of converting an image that is represented in one color space to another color space, the goal being to make the translated image look as similar as possible to the original.
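The conversion step described above can be sketched for one concrete case: translating an 8-bit sRGB triple into CIE XYZ. This is a minimal illustration, assuming the standard sRGB transfer function and the commonly published 4-decimal matrix for a D65 white point; the function names are hypothetical, and a production workflow would normally rely on a color management library and device profiles rather than hand-written code.

```python
def srgb_to_linear(channel_8bit):
    """Undo the sRGB transfer function for one 8-bit channel (0..255)."""
    c = channel_8bit / 255.0
    if c <= 0.04045:
        return c / 12.92
    return ((c + 0.055) / 1.055) ** 2.4


# Linear sRGB -> CIE XYZ (D65 white point), rounded to 4 decimals.
SRGB_TO_XYZ = (
    (0.4124, 0.3576, 0.1805),
    (0.2126, 0.7152, 0.0722),
    (0.0193, 0.1192, 0.9505),
)


def srgb_to_xyz(r, g, b):
    """Convert an 8-bit sRGB color to XYZ, with Y scaled so that white is about 1."""
    linear = [srgb_to_linear(v) for v in (r, g, b)]
    return tuple(
        sum(weight * value for weight, value in zip(row_weights, linear))
        for row_weights in SRGB_TO_XYZ
    )


white = srgb_to_xyz(255, 255, 255)
black = srgb_to_xyz(0, 0, 0)
mid_gray = srgb_to_xyz(128, 128, 128)
print(white, black, mid_gray)
```

The mid-gray result illustrates why the transfer function matters: an 8-bit value of 128 corresponds to a relative luminance of roughly 0.22 rather than 0.5.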
== RGB density == The RGB color model is implemented in different ways, depending on the capabilities of the system used. The most common incarnation in general use as of 2021 is the 24-bit implementation, with 8 bits, or 256 discrete levels of color per channel. Any color space based on such a 24-bit RGB model is thus limited to a range of 256×256×256 ≈ 16.7 million colors. Some implementations use 16 bits per component for 48 bits total, resulting in the same gamut with a larger number of distinct colors. This is especially important when working with wide-gamut color spaces (where most of the more common colors are located relatively close together), or when a large number of digital filtering algorithms are used consecutively. The same principle applies for any color space based on the same color model, but implemented at different bit depths. == Lists == CIE 1931 XYZ color space was one of the first attempts to produce a color space based on measurements of human color perception (earlier efforts were by James Clerk Maxwell, König & Dieterici, and Abney at Imperial College) and it is the basis for almost all other color spaces. The CIERGB color space is a linearly-related companion of CIE XYZ. Additional derivatives of CIE XYZ include the CIELUV, CIEUVW, and CIELAB. === Generic === RGB uses additive color mixing, because it describes what kind of light needs to be emitted to produce a given color. RGB stores individual values for red, green and blue. RGBA is RGB with an additional channel, alpha, to indicate transparency. Common color spaces based on the RGB model include sRGB, Adobe RGB, ProPhoto RGB, scRGB, and CIE RGB. CMYK uses subtractive color mixing used in the printing process, because it describes what kind of inks need to be applied so the light reflected from the substrate and through the inks produces a given color. One starts with a white substrate (canvas, page, etc.), and uses ink to subtract color from white to create an image. 
CMYK stores ink values for cyan, magenta, yellow and black. There are many CMYK color spaces for different sets of inks, substrates, and press characteristics (which change the dot gain or transfer function for each ink and thus change the appearance). YIQ was formerly used in NTSC (North America, Japan and elsewhere) television broadcasts for historical reasons. This system stores a luma value roughly analogous to (and sometimes incorrectly identified as) luminance, along with two chroma values as approximate representations of the relative amounts of blue and red in the color. It is similar to the YUV scheme used in most video capture systems and in PAL (Australia, Europe, except France, which uses SECAM) television, except that the YIQ color space is rotated 33° with respect to the YUV color space and the color axes are swapped. The YDbDr scheme used by SECAM television is rotated in another way. YPbPr is a scaled version of YUV. It is most commonly seen in its digital form, YCbCr, used widely in video and image compression schemes such as MPEG and JPEG. xvYCC is an international digital video color space standard published by the IEC (IEC 61966-2-4). It is based on the ITU BT.601 and BT.709

Knowledge as a service

Knowledge as a service (KaaS) is a computing service that delivers information to users, backed by a knowledge model, which might be drawn from a number of possible models based on decision trees, association rules, or neural networks. A knowledge as a service provider responds to knowledge requests from users through a centralised knowledge server, and provides an interface between users and data owners. KaaS is one of several cloud computing-dependent business models in which computer resources are sold on an on-demand and pay-as-you-use basis. == Overview == At the International Semantic Web Conference 2019, it was described how knowledge can be made live and evolve on the web, allowing users to learn directly from elaborated knowledge, now appearing in the form of knowledge graphs. KaaS appears when knowledge graphs are accessed via services. This is in contrast to data as a service (DaaS), which might "compute large volumes of data; integrate and analyzes that data; and publish it in real-time, using Web service APIs" (from Data as a Service), whereas KaaS is able to exploit context: both the context of the user in relation to their information requests of the KaaS (where and when they make the request) and the context of the information in relation to some objective or purpose of the user, either understood by the KaaS automatically or indicated to it by the user. == Differentiating knowledge from data == Conceptual models that make such a differentiation, such as the so-called DIKW pyramid, have existed for perhaps more than 40 years (see a 1974 journal article about this); however, definitions are not stable or universally accepted (see the discussion about the conceptualizations of DIKW within the DIKW Wikipedia article that question the value of wisdom).
The knowledge component of DIKW is generally agreed to be an elusive concept that is difficult to define; however, Rowley (2007), in a well-known student textbook, differentiated knowledge from data by stating that knowledge is "defined with reference to information" and that it contains more than just facts but also "beliefs and expectations". In relation to knowledge graphs, knowledge may be the additional content they provide over and above pure data, namely the definition of the categories, properties and relations between the concepts, data and entities that substantiate one, many or all domains of discourse (see the definition of Ontology). The ability to represent "beliefs and expectations", or other forms of knowledge that are not straightforwardly explicit, is an ongoing area of improvement in information sciences (see Tacit knowledge), and the establishment of informatics mechanisms to do so is critical to the legitimacy of KaaS as something distinct from value-added DaaS. The ability of knowledge graphs to represent context through these definitions of categories, properties and relations (see the definition of Ontology) has led to the idea that supplying access to knowledge graphs might be a required competency of a KaaS. == Delivery of knowledge == Much service-delivered content depends on a session to provide much of the context that the user (client) needs to understand answers to questions. For example, using current HTTP internet protocols, when a client (a human or a machine) issues a GET request to retrieve information identified by a URI, such as a web page, access information may be supplied automatically to enable that client to bypass paywalls or other content access controls. Such context, in this case about the client's information access allowances, can alter the information provided.
In a logical extension of this internet protocols example, a server would receive from the client, either manually or automatically, a full context, that is, information about the situation the client is in, and this would allow the server to best interpret the client's request. Current internet protocols allow formats, languages and related preferences to be expressed by clients but make no mention of what a client already knows and what it may understand. The recent Content Negotiation by Profile proposal describes additions to both the HTTP internet protocols and related services that allow clients to also request information (a response from the server) that accords with an identified information model. This allows clients to indicate not just the formats and languages that they understand (technically, that they prefer) but also the domains of discourse that they understand, which is a step towards comprehensive client context provision.
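The kind of request described above can be sketched as follows. This is a hedged illustration only: the `Accept-Profile` header name and the angle-bracket URI syntax reflect one reading of the Content Negotiation by Profile draft and should be checked against the current specification, and the resource URL, the profile URI, and the helper function name are all hypothetical. The request object is built but never sent.

```python
from urllib.request import Request


def profile_negotiation_headers(profile_uri, media_type="text/turtle", language="en"):
    """Collect the client's preferences: format, language, and information model."""
    return {
        "Accept": media_type,                 # established HTTP format negotiation
        "Accept-Language": language,          # established HTTP language negotiation
        "Accept-Profile": f"<{profile_uri}>",  # assumed header from the profile draft
    }


# Hypothetical profile and resource identifiers, for illustration only.
headers = profile_negotiation_headers("https://example.org/profiles/hypothetical-model")
request = Request("https://example.org/resource/1", headers=headers, method="GET")

print(request.get_method(), request.full_url)
for name, value in headers.items():
    print(f"{name}: {value}")
```

A server that supports this style of negotiation could then select a representation that conforms to the requested information model, in addition to honouring the format and language preferences.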

IMPACT (computer graphics)

IMPACT (sometimes spelled Impact) is a computer graphics architecture for Silicon Graphics computer workstations. IMPACT Graphics was developed in 1995 and was available as a high-end graphics option on workstations released during the mid-1990s. IMPACT graphics gives the workstation real-time 2D and 3D graphics rendering capability similar to that of even high-end PCs made well after IMPACT's introduction. IMPACT graphics systems consist of either one or two Geometry Engines and one or two Raster Engines in various configurations. IMPACT graphics consists of five graphics subsystems: the Command Engine, Geometry Subsystem, Raster Engine, framebuffer and Display Subsystem. IMPACT Graphics can produce resolutions up to 1600 x 1200 pixels with 32-bit color and can also process unencoded NTSC and PAL analog television signals. IMPACT graphics subsystems come in three configurations for SGI Indigo2 IMPACT workstations: Solid IMPACT, High IMPACT, and Maximum IMPACT. The equivalent configurations also exist for the SGI Octane workstation but are referred to as SI, SSI, and MXI (I-series). Later Octane workstations used a similar configuration but with updated ASIC chips and are referred to as SE, SSE, and MXE (E-series). IMPACT uses Rambus RDRAM for texture memory. The IMPACT graphics architecture was superseded by SGI's VPro graphics architecture in 1997.

GNU social

GNU social (and its predecessor StatusNet) is a largely defunct free and open-source microblogging social networking service that implements the OStatus and ActivityPub standards for interoperability between installations. While offering similar functionality to social networks such as Twitter, GNU social seeks to provide the ability for open and federated communication between different microblogging communities, known as 'instances'. Both enterprises and individuals can install and control their own instances and user data. At its peak in popularity, GNU social had been deployed on hundreds of interconnected instances; however, it has since fallen into disuse as competing software such as Mastodon and Pleroma has taken its position as the dominant federated microblogging software. Later in its lifespan, the project split into two separate branches: "v2", a continuation of the original codebase for maintenance of existing instances, and "v3", a complete redesign of the project meant to integrate further ActivityPub support and modernize the user experience and its technological back-end. As of August 15, 2022, there had been no new commits to the v2 branch, and by November 25, 2022, the v3 branch was also no longer being actively developed, leaving the project essentially abandoned. Despite its obsolescence and dated design compared to modern platforms, GNU social and StatusNet are regarded as the origin of the Fediverse network and have had a major influence on the design of the more modern decentralized social networks that succeeded them. == History == Although it is the main project within its lineage, GNU social originally began as a fork of StatusNet. The software was first developed for a service called identi.ca by Evan Prodromou, which offered free microblogging accounts to the public.
The software quickly became one of the first popular examples of a decentralized social network, as identi.ca allowed any other server that was running the software to communicate with it, something which had not previously been attempted in social media at such a large scale. === StatusNet === Originally, StatusNet (named Laconica at the time) was launched with a communication protocol designed specifically for the project called OpenMicroBlogging (OMB). With version 0.8.1, the name of the software was changed to StatusNet. Version 0.9.0 was released soon after, on March 3, 2010, with the developers implementing a newly designed protocol dubbed OStatus; support for OMB was dropped not long after. Compared to OpenMicroBlogging, OStatus could handle and federate more events and actions than the basic plaintext communication that OMB provided, and it was based on a variety of other web technologies, allowing for easier adoption of new implementations of the protocol for servers and clients compared to the fully custom architecture of OMB. With the StatusNet name change, the company developing both the software and OStatus as well as managing identi.ca rebranded from Control Yourself to StatusNet Inc. In August 2010, the company raised a new round of venture capital funds from sources such as First Mark Capital, BOLDstart Ventures, iNovia Capital and Montreal Start Up to establish a hosting service under the status.net domain, bringing its total funding up to that point to over $2.3 million. The hosting service allowed anyone to establish their own StatusNet instance without maintaining a server, similar to WordPress.com and other blogging platforms. New registrations on identi.ca, along with the ability to create new status.net instances, were disabled in December 2012 in preparation for a migration to pump.io, an event that users of StatusNet and OStatus have since named "the Pumpocalypse".
pump.io was a brand new software package like StatusNet, but with a new protocol designed for general purpose activity streams outside of microblogging and ease-of-use for developers building on the technology, much like the transition from OMB to OStatus. The announcement was seen as unexpected among identi.ca users, who were concerned about the possibility of their statuses being deleted with the transition. At the same time, server administrators running third-party instances and their users who were left behind on StatusNet were also worried, as it was unclear at the time whether future development of the software would be picked up by a new maintainer. The transition for identi.ca users to pump.io was completed on 12 July 2013. ==== Previous names ==== The original name of StatusNet was Laconica, a reference to the Laconic phrase; a particularly brief statement commonly attributed to the leaders of Sparta (Laconia being the Greek region containing Sparta). In microblogging, all messages are designed to be very short due to the traditional 140-character limit on message size, a limitation imported from SMS. Beginning with version 0.8.1, the name was changed to StatusNet. The developers said that the new name "simply reflects what our software does: send status updates into your social network." === GNU social === GNU social originally began as a side project of GNU FM (Libre.fm) maintainer Matt Lee, with the goal of being able to federate messages between Last.fm and other instances of GNU FM using StatusNet plugins. Around the same time, a developer named Mikael Nordfeldth forked StatusNet with the intention of maintaining it as a personal project, dubbing it "Free Social". However, following identi.ca's transition to pump.io and its developers' sudden abandonment of StatusNet, the projects received more attention from server administrators and other users looking for an actively updated alternative. 
Shortly after LibrePlanet 2012, a plan was formed to merge all three projects into a single service. On June 8, 2013, it was announced that along with Free Social, StatusNet would be merged into the GNU social project and stewarded by the Free Software Foundation, with the project since becoming the dominant variant of StatusNet. During GNU social's lifespan, a popular theme for the user interface named Quitter was used, which was similar to an earlier Twitter interface. Many instances were made specifically using the name Quitter such as Quitter.se, an instance created by the developer of the theme. Before the establishment of Mastodon's popularity and dominance within the network, Quitter was noted as a frequent location for users of Twitter to migrate to when users disagreed with moderation policies or feature updates, such as when an algorithmic feed was added to Twitter. A fork of GNU social was made called postActiv, which planned to rewrite the backend and user interface of GNU social, as well as to add compatibility for Diaspora's protocol. == Features == A basic GNU social instance takes the form of a microblogging service with a reverse chronological timeline that features status updates and small messages from followed accounts, similar to other services such as Twitter or Weibo. While users could see their own customized timeline, they could access another timeline that showcased every message that the instance knows of, including from other instances that were connected to each other if someone on the instance followed an account from it. Users could also create and join groups, which allows for discussion and collaboration on specific topics. Administrators can also customize their server via the plugin system, which allows developers to create new features or modify existing plugins to suit the needs of the instance via PHP. 
A notable plugin built for GNU social was Quitter, a revamp of the user interface that resembles an earlier version of Twitter's user interface.