- JUREX-4E: Juridical Expert-Annotated Four-Element Knowledge Base for Legal Reasoning The Four-Element Theory is a fundamental framework in criminal law, defining the constitution of crime through four dimensions: Subject, Object, Subjective aspect, and Objective aspect. This theory is widely referenced in legal reasoning, and many Large Language Models (LLMs) attempt to incorporate it when handling legal tasks. However, current approaches rely on LLMs' internal knowledge to incorporate this theory, often lacking completeness and representativeness. To address this limitation, we introduce JUREX-4E, an expert-annotated knowledge base covering 155 criminal charges. It is structured through a progressive hierarchical annotation framework that prioritizes legal source validity and employs diverse legal interpretation methods to ensure comprehensiveness and authority. We evaluate JUREX-4E on the Similar Charge Distinction task and apply it to Legal Case Retrieval, demonstrating its effectiveness in improving LLM performance. Experimental results validate the high quality of JUREX-4E and its substantial impact on downstream legal tasks, underscoring its potential for advancing legal AI applications. Code: https://github.com/THUlawtech/JUREX 8 authors · Feb 24
- Fast and Accurate Prediction of Material Properties with Three-Body Tight-Binding Model for the Periodic Table Parameterized tight-binding models fit to first principles calculations can provide an efficient and accurate quantum mechanical method for predicting properties of molecules and solids. However, well-tested parameter sets are generally only available for a limited number of atom combinations, making routine use of this method difficult. Furthermore, most previous models consider only simple two-body interactions, which limits accuracy. To tackle these challenges, we develop a density functional theory database of nearly one million materials, which we use to fit a universal set of tight-binding parameters for 65 elements and their binary combinations. We include both two-body and three-body effective interaction terms in our model, plus self-consistent charge transfer, enabling our model to work for metallic, covalent, and ionic bonds with the same parameter set. To ensure predictive power, we adopt a learning framework where we repeatedly test the model on new low energy crystal structures and then add them to the fitting dataset, iterating until predictions improve. We distribute the materials database and tools developed in this work publicly. 2 authors · Dec 21, 2021
- Lifelong Machine Learning Potentials Machine learning potentials (MLPs) trained on accurate quantum chemical data can retain that high accuracy while imposing little computational cost. On the downside, they need to be trained for each individual system. In recent years, a vast number of MLPs have been trained from scratch because learning additional data typically requires retraining on all data to avoid forgetting previously acquired knowledge. Additionally, most common structural descriptors of MLPs cannot efficiently represent a large number of different chemical elements. In this work, we tackle these problems by introducing element-embracing atom-centered symmetry functions (eeACSFs), which combine structural properties and element information from the periodic table. These eeACSFs are key to our development of a lifelong machine learning potential (lMLP). Uncertainty quantification can be exploited to move beyond a fixed, pre-trained MLP and arrive at a continuously adapting lMLP, because a predefined level of accuracy can be ensured. To extend the applicability of an lMLP to new systems, we apply continual learning strategies to enable autonomous and on-the-fly training on a continuous stream of new data. For the training of deep neural networks, we propose the continual resilient (CoRe) optimizer and incremental learning strategies relying on rehearsal of data, regularization of parameters, and the architecture of the model. 2 authors · Mar 10, 2023
- MACE4IR: A foundation model for molecular infrared spectroscopy Machine-learned interatomic potentials (MLIPs) have shown significant promise in predicting infrared spectra with high fidelity. However, the absence of general-purpose MLIPs capable of handling a wide range of elements and their combinations has limited their broader applicability. In this work, we introduce MACE4IR, a machine learning foundation model built on the MACE architecture and trained on 10 million geometries and corresponding density-functional theory (DFT) energies, forces and dipole moments from the QCML dataset. The training data encompasses approximately 80 elements and a diverse set of molecules, including organic compounds, inorganic species, and metal complexes. MACE4IR accurately predicts energies, forces, dipole moments, and infrared spectra at significantly reduced computational cost compared to DFT. By combining generality, accuracy, and efficiency, MACE4IR opens the door to rapid and reliable infrared spectra prediction for complex systems across chemistry, biology, and materials science. 5 authors · Aug 26
- AB5 type multicomponent TiVCoNiMn2 high-entropy alloy Recent theoretical and practical research has focused on multi-component High Entropy Alloys (HEAs), which have mechanical and functional properties superior to those of standard alloys based on a single major element, thereby establishing a new field. A multi-component HEA contains five or more primary elements at concentrations ranging from 5 to 35 atomic percent. We examined the microstructure and mechanical properties of the TiVCoNiMn2 HEA. The mixing enthalpy and other thermodynamic parameters were determined using Miedema's model. TiVCoNiMn2 exhibits a mixing enthalpy of -15.6 kJ/mol and an atomic radius mismatch of approximately 10.03%. The HEA is derived from both hydride-forming and non-hydride-forming elements, so it could be a useful hydrogen storage material. The hydrogen absorption/desorption capabilities of these HEAs are promising. 4 authors · Mar 24, 2024
- Vector-Based Approach to the Stoichiometric Analysis of Multicomponent Chemical Reactions: The Case of Black Powder The study demonstrates the capabilities of a vector-based approach for calculating stoichiometric coefficients in chemical equations, using black powder as an illustrative example. A method is proposed for selecting and constraining intermediate interactions between reactants, as well as for identifying final products. It is shown that even a small number of components can lead to a large number of final and intermediate products. Through concrete calculations, a correlation is established between the number of possible chemical equations and the number of reactants. A methodology is proposed for computing all possible chemical equations within a reaction system for arbitrary component ratios, enabling the derivation of all feasible chemical reactions. Additionally, a method is developed for calculating the chemical composition for a fixed set of reactants, allowing for the evaluation of the set of products resulting from all possible chemical interactions given a specified initial composition. 3 authors · Oct 29
- A Vector-Based Algorithm for Generating Complete Balanced Reaction Sets with Arbitrary Numbers of Reagents We present a vector-based method to balance chemical reactions. The algorithm builds candidates in a deterministic way, removes duplicates, and always prints coefficients in the lowest whole-number form. For redox cases, electrons and protons/hydroxide are treated explicitly, so both mass and charge are balanced. We also outline the basic principles of the vector formulation of stoichiometry, interpreting reactions as integer vectors in composition space; this geometric view supports compact visualizations of reagent-product interactions and helps surface distinct reaction families. The method enumerates valid balances for arbitrary user-specified species lists without special-case balancing rules or symbolic tricks, and it provides a clean foundation for developing new algorithmic variants (e.g., alternative objectives or constraints). On representative examples (neutralization, double displacement, decomposition, classical redox, small multicomponent sets) and a negative control, the method produced correct integer balances. When multiple balances exist, we report a canonical one (minimizing the total coefficient sum with a simple tie-breaker) without claiming global optimality beyond the solutions the search enumerates. The procedure applies per reaction and extends to reaction networks via consistent per-reaction application. We do not report runtimes; broader benchmarking and code/data release are planned. 3 authors · Oct 29
- Next highest weight and other lower SU(3) irreducible representations with proxy-SU(4) symmetry for nuclei with 32 ≤ Z, N ≤ 46 In applications of the proxy-SU(3) model to determining (β, γ) values for nuclei across the periodic table, and to understanding the preponderance of triaxial shapes in nuclei with Z ≥ 30, it is seen that one needs not only the highest weight (hw) or leading SU(3) irreducible representation (irrep) (λ_H, μ_H) but also the lower SU(3) irreps (λ, μ) such that 2λ + μ = 2λ_H + μ_H - 3r with r = 0, 1 and 2 [Bonatsos et al., Symmetry 16, 1625 (2024)]. These give the next highest weight (nhw) irrep, the next-to-next highest weight (nnhw) irrep, and so on. Recently, it was shown that for nuclei with 32 ≤ Z, N ≤ 46, there is not only proxy-SU(3) but also proxy-SU(4) symmetry [Kota and Sahu, Physica Scripta 99, 065306 (2024)]. Following these developments, this paper presents the SU(3) irreps (λ, μ) with 2λ + μ = 2λ_H + μ_H - 3r, r = 0, 1, 2 for various isotopes of Ge, Se, Kr, Sr, Zr, Mo, Ru and Pd (with 32 ≤ N ≤ 46), assuming good proxy-SU(4) symmetry. A simple method for obtaining the SU(3) irreps is described and applied. The tabulations of proxy-SU(3) irreps provided in this paper will be useful in further investigations of triaxial shapes in these nuclei. 1 author · Oct 1
- Synergistic Fusion of Multi-Source Knowledge via Evidence Theory for High-Entropy Alloy Discovery Discovering novel high-entropy alloys (HEAs) with desirable properties is challenging due to the vast compositional space and complex phase formation mechanisms. Efficient exploration of this space requires a strategic approach that integrates heterogeneous knowledge sources. Here, we propose a framework that systematically combines knowledge extracted from computational material datasets with domain knowledge distilled from scientific literature using large language models (LLMs). A central feature of this approach is the explicit consideration of element substitutability, identifying chemically similar elements that can be interchanged to potentially stabilize desired HEAs. Dempster-Shafer theory, a mathematical framework for reasoning under uncertainty, is employed to model and combine substitutabilities based on aggregated evidence from multiple sources. The framework predicts the phase stability of candidate HEA compositions and is systematically evaluated on both quaternary alloy systems, demonstrating superior performance compared to baseline machine learning models and methods reliant on single-source evidence in cross-validation experiments. By leveraging multi-source knowledge, the framework retains robust predictive power even when key elements are absent from the training data, underscoring its potential for knowledge transfer and extrapolation. Furthermore, the enhanced interpretability of the methodology offers insights into the fundamental factors governing HEA formation. Overall, this work provides a promising strategy for accelerating HEA discovery by integrating computational and textual knowledge sources, enabling efficient exploration of vast compositional spaces with improved generalization and interpretability. 9 authors · Feb 20
- Symmetry-invariant quantum machine learning force fields Machine learning techniques are essential tools to compute efficient, yet accurate, force fields for atomistic simulations. This approach has recently been extended to incorporate quantum computational methods, making use of variational quantum learning models to predict potential energy surfaces and atomic forces from ab initio training data. However, the trainability and scalability of such models are still limited, due to both theoretical and practical barriers. Inspired by recent developments in geometric classical and quantum machine learning, here we design quantum neural networks that explicitly incorporate, as a data-inspired prior, an extensive set of physically relevant symmetries. We find that our invariant quantum learning models outperform their more generic counterparts on individual molecules of growing complexity. Furthermore, we study a water dimer as a minimal example of a system with multiple components, showcasing the versatility of our proposed approach and opening the way towards larger simulations. Our results suggest that molecular force fields generation can significantly profit from leveraging the framework of geometric quantum machine learning, and that chemical systems represent, in fact, an interesting and rich playground for the development and application of advanced quantum machine learning tools. 5 authors · Nov 19, 2023
- Measurement of the electric dipole moment of AlCl We report the measurement of the electric dipole moment of aluminum monochloride (AlCl) using a cryogenic buffer-gas beam source. Our measurements provide values for the dipole moments of the two lowest vibrational states of the X¹Σ⁺ and the A¹Π electronic states. We also show that spin-orbit coupling with an extended number of spin states is essential in the ab initio calculation to correctly describe both the dipole moment and the T_e energy of AlCl. We further lay out the implications of these results for astrophysical models of stellar and planetary evolution that have used a substitute value for the dipole moment of AlCl until now. 5 authors · Mar 17
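
The two vector-based stoichiometry entries above treat a reaction as an integer vector in composition space. As a rough illustration of that idea only, and not the authors' algorithm, the sketch below balances a single reaction by computing the null space of an element-by-species matrix with exact fractions. The function names (`nullspace`, `to_lowest_integers`, `balance`) are hypothetical, and the black powder equation used is the common textbook idealization, assumed here for demonstration rather than taken from either paper.

```python
from fractions import Fraction
from functools import reduce
from math import gcd


def nullspace(rows):
    """Return a basis of the rational null space of a matrix (list of rows)."""
    m = [[Fraction(x) for x in row] for row in rows]
    n_cols = len(m[0])
    pivots = []
    r = 0
    for c in range(n_cols):
        # Find a row at or below r with a nonzero entry in column c.
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        pivot = m[r][c]
        m[r] = [x / pivot for x in m[r]]
        # Clear column c in every other row (reduced row echelon form).
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                f = m[i][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
        if r == len(m):
            break
    free = [c for c in range(n_cols) if c not in pivots]
    basis = []
    for fc in free:
        v = [Fraction(0)] * n_cols
        v[fc] = Fraction(1)
        for row_idx, pc in enumerate(pivots):
            v[pc] = -m[row_idx][fc]
        basis.append(v)
    return basis


def to_lowest_integers(vec):
    """Scale a rational vector to the smallest integer vector, first nonzero entry positive."""
    lcm = reduce(lambda a, b: a * b // gcd(a, b), (x.denominator for x in vec), 1)
    ints = [int(x * lcm) for x in vec]
    g = reduce(gcd, (abs(i) for i in ints))
    ints = [i // g for i in ints]
    if next(i for i in ints if i != 0) < 0:
        ints = [-i for i in ints]
    return ints


def balance(reactants, products):
    """Balance one reaction; species map a name to an {element: count} dict."""
    species = list(reactants) + list(products)
    comps = {**reactants, **products}
    elements = sorted({el for comp in comps.values() for el in comp})
    # One row per element; product columns are negated so a balance solves A x = 0.
    rows = [
        [reactants[s].get(el, 0) for s in reactants]
        + [-products[s].get(el, 0) for s in products]
        for el in elements
    ]
    basis = nullspace(rows)
    if len(basis) != 1:
        raise ValueError("reaction has no unique balance (null space dimension != 1)")
    coeffs = to_lowest_integers(basis[0])
    if any(c <= 0 for c in coeffs):
        raise ValueError("no balance with all-positive coefficients")
    return dict(zip(species, coeffs))


# Idealized black powder combustion (assumed textbook form).
reactants = {"KNO3": {"K": 1, "N": 1, "O": 3}, "S": {"S": 1}, "C": {"C": 1}}
products = {"K2S": {"K": 2, "S": 1}, "N2": {"N": 2}, "CO2": {"C": 1, "O": 2}}
print(balance(reactants, products))
# {'KNO3': 2, 'S': 1, 'C': 3, 'K2S': 1, 'N2': 1, 'CO2': 3}
```

This sketch handles only the case where the null space is one-dimensional. Enumerating all balances of a multicomponent system, as the abstracts describe, would require searching over combinations of basis vectors, which is outside its scope.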